Russell Bryant [Thu, 20 Mar 2008 21:54:58 +0000 (21:54 +0000)]
Merged revisions 110335 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.2
........
r110335 | russell | 2008-03-20 16:53:27 -0500 (Thu, 20 Mar 2008) | 6 lines
Fix some very broken code that was introduced in 1.2.26 as a part of the security
fix. The dnsmgr is not appropriate here. The dnsmgr takes a pointer to an address
structure that a background thread continuously updates. However, in these cases,
a stack variable was passed. That means that the dnsmgr thread would be continuously
writing to bogus memory.
Joshua Colp [Wed, 19 Mar 2008 19:11:33 +0000 (19:11 +0000)]
Add sanity checking for position resuming. We *have* to make sure that the position does not exceed the total number of files present, and we have to make sure that the position's filename is the same as previous. These values can change if a music class is reloaded and give unpredictable behavior.
(closes issue #11663)
Reported by: junky
Joshua Colp [Wed, 19 Mar 2008 18:20:28 +0000 (18:20 +0000)]
Make sure that the mark bit does not incorrectly cause video frame timestamps to be calculated as if they are audio frames.
(closes issue #11429)
Reported by: sperreault
Patches:
11429-frametype.diff uploaded by qwell (license 4)
I didn't give tzafrir very much time to test this, but if he does
still have remaining issues, he is welcome to
re-open this bug, and we'll do what is called for.
I reproduced the problem, and tested the fix, so I hope I
am not jumping by just going ahead and committing the fix.
The problem was with what file_save does with templates;
firstly, it tended to print out multiple options:
[my_category](!)(templateref)
instead of
[my_category](!,templateref)
which is fixed by this patch.
Nextly, the code to suppress output of duplicate declarations
that would occur because the reader copies inherited declarations
down the hierarchy, was not working. Thus:
[template](!,master-template)
mastervar = bar
tvar = value
[cat](template)
mastervar = bar
tvar = value
catvar = val
This has been fixed. Since the config reader 'explodes' inherited
vars into the category, users may, in certain circumstances, see
output different from what they originally entered, but it should
be both correct and equivalent.
Mark Michelson [Tue, 18 Mar 2008 20:52:15 +0000 (20:52 +0000)]
This patch makes it so that all queue member status changes are handled through device state
code. This removes several problems people were seeing where their queue members would get into
an "unknown" state. Huge props go to atis on this one since he was the one who found the code
section that was causing the problem and proposed the solution. I just wrote what he suggested :)
Steve Murphy [Tue, 18 Mar 2008 06:37:15 +0000 (06:37 +0000)]
(closes issue #11903)
Reported by: atis
Many thanks to atis for spotting this problem and reporting it.
The fix was to straighten out how items are placed on and removed
from the file stack. Regressions as well as the provided test case
helped to straighten out all code paths. valgrind was used to make
sure all memory allocated was freed.
Sorry for not solving this earlier. I got distracted.
Added the ntest23 regression test, which is mainly a copy of ntest22,
but with a few juicy errors thrown in, to replicate the kind of
error that atis spotted.
Mark Michelson [Mon, 17 Mar 2008 22:05:49 +0000 (22:05 +0000)]
Fix a logic flaw in the code that stores lock info which is displayed
via the "core show locks" command. The idea behind this section of code was
to remove the previous lock from the list if it was a trylock that had failed.
Unfortunately, instead of checking the status of the previous lock, we were referencing
the index immediately following the previous lock in the lock_info->locks array.
The result of this problem, under the right circumstances, was that the lock which
we currently in the process of attempting to acquire could "overwrite" the previous lock
which was acquired. While this does not in any way affect typical operation, it *could*
lead to misleading "core show locks" output.
Joshua Colp [Mon, 17 Mar 2008 16:24:29 +0000 (16:24 +0000)]
200 OKs in response to a reinvite need to be sent reliably. If the remote side does not receive one the dialog will be torn down.
(closes issue #12208)
Reported by: atrash
Mark Michelson [Fri, 14 Mar 2008 16:44:08 +0000 (16:44 +0000)]
Fix a race condition in the SIP packet scheduler which could cause a crash.
chan_sip uses the scheduler API in order to schedule retransmission of reliable
packets (such as INVITES). If a retransmission of a packet is occurring, then the
packet is removed from the scheduler and retrans_pkt is called. Meanwhile, if
a response is received from the packet as previously transmitted, then when we
ACK the response, we will remove the packet from the scheduler and free the packet.
The problem is that both the ACK function and retrans_pkt attempt to acquire the
same lock at the beginning of the function call. This means that if the ACK function
acquires the lock first, then it will free the packet which retrans_pkt is about to
read from and write to. The result is a crash.
The solution:
1. If the ACK function fails to remove the packet from the scheduler and the retransmit
id of the packet is not -1 (meaning that we have not reached the maximum number of
retransmissions) then release the lock and yield so that retrans_pkt may acquire the
lock and operate.
2. Make absolutely certain that the ACK function does not recursively lock the lock in
question. If it does, then releasing the lock will do no good, since retrans_pkt will
still be unable to acquire the lock.
Russell Bryant [Thu, 13 Mar 2008 21:38:16 +0000 (21:38 +0000)]
Fix another issue that was causing crashes in chanspy. This introduces a new
datastore callback, called chan_fixup(). The concept is exactly like the
fixup callback that is used in the channel technology interface. This callback
gets called when the owning channel changes due to a masquerade. Before this
was introduced, if a masquerade happened on a channel being spyed on, the
channel pointer in the datastore became invalid.
(closes issue #12187)
(reported by, and lots of testing from atis)
(props to file for the help with ideas)
Russell Bryant [Thu, 13 Mar 2008 21:06:33 +0000 (21:06 +0000)]
Make a tweak that gets the LEDs on polycom phones to blink when an extension that
has been subscribed to goes on hold. Otherwise, they just stay on like it does
when an extension is in use.
(closes issue #11263)
Reported by: russell
Patches:
notify_hold.rev1.txt uploaded by russell (license 2)
Tested by: russell
Russell Bryant [Thu, 13 Mar 2008 20:26:28 +0000 (20:26 +0000)]
Fix a couple uses of sprintf. The second one could actually cause an overflow
of a stack buffer. It's not a security issue though, it only depends on your
configuration.
Mark Michelson [Wed, 12 Mar 2008 21:53:46 +0000 (21:53 +0000)]
Change AST_SCHED_DEL use to ast_sched_del for autocongestion in chan_sip.
The scheduler callback will always return 0. This means that this id
is never rescheduled, so it makes no sense to loop trying to delete
the id from the scheduler queue. If we fail to remove the item from the
queue once, it will fail every single time.
(Yes I realize that in this case, the macro would exit early because the
id is set to -1 in the callback, but it still makes no sense to use
that macro in favor of calling ast_sched_del once and being done with it)
This is the first of potentially several such fixes.
Mark Michelson [Wed, 12 Mar 2008 21:16:28 +0000 (21:16 +0000)]
Added a large comment before the AST_SCHED_DEL macro to explain its purpose as well as when
it is appropriate and when it is not appropriate to use it.
I also removed the part of the debug message that mentions that this is probably a bug because
there are some perfectly legitimate places where ast_sched_del may fail to delete an entry (e.g.
when the scheduler callback manually reschedules with a new id instead of returning non-zero to
tell the scheduler to reschedule with the same idea). I also raised the debug level of the debug
message in AST_SCHED_DEL since it seems like it could come up quite frequently since the macro
is probably being used in several places where it shouldn't be. Also removed the redundant line,
file, and function information since that is provided by ast_log.
Russell Bryant [Wed, 12 Mar 2008 19:57:42 +0000 (19:57 +0000)]
(closes issue #12187, reported by atis, fixed by me after some brainstorming
on the issue with mmichelson)
- Update copyright info on app_chanspy.
- Fix a race condition that caused app_chanspy to crash. The issue was that
the chanspy datastore magic that was used to ensure that spyee channels did
not disappear out from under the code did not completely solve the problem.
It was actually possible for chanspy to acquire a channel reference out of
its datastore to a channel that was in the middle of being destroyed. That
was because datastore destruction in ast_channel_free() was done near the
end. So, this left the code in app_chanspy accessing a channel that was
partially, or completely invalid because it was in the process of being free'd
by another thread. The following sort of shows the code path where the race
occurred:
=============================================================================
Thread 1 (PBX thread for spyee chan) || Thread 2 (chanspy)
--------------------------------------||-------------------------------------
ast_channel_free() ||
- remove channel from channel list ||
- lock/unlock the channel to ensure ||
that no references retrieved from ||
the channel list exist. ||
--------------------------------------||-------------------------------------
|| channel_spy()
- destroy some channel data || - Lock chanspy datastore
|| - Retrieve reference to channel
|| - lock channel
|| - Unlock chanspy datastore
--------------------------------------||-------------------------------------
- destroy channel datastores ||
- call chanspy datastore d'tor ||
which NULL's out the ds' || - Operate on the channel ...
reference to the channel ||
||
- free the channel ||
||
|| - unlock the channel
--------------------------------------||-------------------------------------
=============================================================================
Kevin P. Fleming [Wed, 12 Mar 2008 19:16:07 +0000 (19:16 +0000)]
if we receive an INVITE with a Content-Length that is not a valid number, or is zero, then don't process the rest of the message body looking for an SDP
Joshua Colp [Wed, 12 Mar 2008 18:26:37 +0000 (18:26 +0000)]
Add a trigger mode that triggers on both read and write. The actual function that returns the combined audio frame though will wait until both sides have fed in audio, or until one side stops (such as the case when you call Wait).
(closes issue #11945)
Reported by: xheliox
Joshua Colp [Tue, 11 Mar 2008 19:20:01 +0000 (19:20 +0000)]
Make sure the visible indication is on the right channel so when the masquerade happens the proper indication is enacted.
(closes issue #11707)
Reported by: iam
Joshua Colp [Tue, 11 Mar 2008 18:47:33 +0000 (18:47 +0000)]
Add an additional check for setting conference parameter when using the marked user options. It was possible for it to return to a no listen/no talk state if a masquerade happened.
(closes issue #12136)
Reported by: aragon
Kevin P. Fleming [Tue, 11 Mar 2008 11:04:29 +0000 (11:04 +0000)]
fix up various compiler warnings found with gcc-4.3:
- the output of flex includes a static function called 'input' that is not used, so for the moment we'll stop having the compiler tell us about unused variables in the flex source files (a better fix would be to improve our flex post-processing to remove the unused function)
- main/stdtime/localtime.c makes assumptions about signed integer overflow, and gcc-4.3's improved optimizer tries to take advantage of handling potential overflow conditions at compile time; for now, suppress these optimizations until we can fiure out if the code needs improvement
- main/udptl.c has some references to uninitialized variables; in one case there was no bug, but in the other it was certainly possibly for unexpected behavior to occur
Russell Bryant [Mon, 10 Mar 2008 20:17:11 +0000 (20:17 +0000)]
Fix another bug specifically related to asynchronous call origination. Once the
PBX is started on the channel using ast_pbx_start(), then the ownership of the
channel has been passed on to another thread. We can no longer access it in this
code. If the channel gets hung up very quickly, it is possible that we could
access a channel that has been free'd.
Russell Bryant [Mon, 10 Mar 2008 20:04:27 +0000 (20:04 +0000)]
Fix some bugs related to originating calls. If the code failed to start a PBX
on the channel (such as if you set a call limit based on the system's load
average), then there were cases where a channel that has already been free'd
using ast_hangup() got accessed. This caused weird memory corruption and
crashes to occur.
(fixes issue BE-386)
(much debugging credit goes to twilson, final patch written by me)
don't generate D-Channel "up" and "down" messages unless the channel state is actually changing; also, generate the "up" message when an implicit "up" occurs due to reception of a normal event when we thought the channel was "down"
Russell Bryant [Fri, 7 Mar 2008 17:16:58 +0000 (17:16 +0000)]
Change a warning message to a debug message. This is happening quite frequently,
and it is not worth spamming users with these messages unless we are pretty confident
that it should never happen. As it stands today, it _will_ and _does_ happen and
until that gets cleaned up a reasonable amount on the development side, let's not
spam the logs of everyone else.
Mark Michelson [Thu, 6 Mar 2008 22:10:07 +0000 (22:10 +0000)]
Quell an annoying message that is likely to print every single time that
ast_pbx_outgoing_app is called. The reason is that __ast_request_and_dial
allocates the cdr for the channel, so it should be expected that the channel
will have a cdr on it.
Joshua Colp [Wed, 5 Mar 2008 22:32:10 +0000 (22:32 +0000)]
Add a control frame to indicate the source of media has changed. Depending on the underlying technology it may need to change some things.
(closes issue #12148)
Reported by: jcomellas
document var_metric so no bugreports will come in when it's actually a configuration issue.
(issue #12151)
Reported and patched by: caio1982
1.4 patch by me
when a PRI call must be moved to a different B channel at the request of the other endpoint, ensure that any DSP active on the original channel is moved to the new one
Tilghman Lesher [Wed, 5 Mar 2008 15:17:16 +0000 (15:17 +0000)]
Correctly initialize retransid in SIP, and ensure that the warning when failing to delete a schedule entry can actually hit the log.
(closes issue #12140)
Reported by: slavon
Patches:
sch2.patch uploaded by slavon (license 288)
(Patch slightly modified by me)
Russell Bryant [Wed, 5 Mar 2008 01:52:18 +0000 (01:52 +0000)]
Fix a bug that I just noticed in the RTP code. The calculation for setting the
len field in an ast_frame of audio was wrong when G.722 is in use. The len field
represents the number of ms of audio that the frame contains. It would have
set the value to be twice what it should be.
Joshua Colp [Tue, 4 Mar 2008 18:05:28 +0000 (18:05 +0000)]
When a new source of audio comes in (such as music on hold) make sure the marker bit gets set.
(closes issue #10355)
Reported by: wdecarne
Patches:
10355.diff uploaded by file (license 11)
(closes issue #11491)
Reported by: kanderson
Russell Bryant [Tue, 4 Mar 2008 04:31:29 +0000 (04:31 +0000)]
Backport a minor bug fix from trunk that I found while doing random code
cleanup. Properly break out of the loop when a context isn't found when
verify that includes are valid.
Russell Bryant [Mon, 3 Mar 2008 17:05:16 +0000 (17:05 +0000)]
Fix a potential memory leak of the local_pvt struct when ast_channel allocation
fails. Also, in passing, centralize the code necessary to destroy a local_pvt.
Russell Bryant [Mon, 3 Mar 2008 15:50:43 +0000 (15:50 +0000)]
Merge in some changes from team/russell/autoservice-nochans-1.4
These changes fix up some dubious code that I came across while auditing what
happens in the autoservice thread when there are no channels currently in
autoservice.
1) Change it so that autoservice thread doesn't keep looping around calling
ast_waitfor_n() on 0 channels twice a second. Instead, use a thread condition
so that the thread properly goes to sleep and does not wake up until a
channel is put into autoservice.
This actually fixes an interesting bug, as well. If the autoservice thread
is already running (almost always is the case), then when the thread goes
from having 0 channels to have 1 channel to autoservice, that channel would
have to wait for up to 1/2 of a second to have the first frame read from it.
2) Fix up the code in ast_waitfor_nandfds() for when it gets called with no
channels and no fds to poll() on, such as was the case with the previous code
for the autoservice thread. In this case, the code would call alloca(0), and
pass the result as the first argument to poll(). In this case, the 2nd
argument to poll() specified that there were no fds, so this invalid pointer
shouldn't actually get dereferenced, but, this code makes it explicit and
ensures the pointers are NULL unless we have valid data to put there.
Joshua Colp [Mon, 3 Mar 2008 15:28:59 +0000 (15:28 +0000)]
It is possible for no audio to pass between the current digit and next digit so expand logic that clears emulation to AST_FRAME_NULL.
(closes issue #11911)
Reported by: edgreenberg
Patches:
v1-11911.patch uploaded by dimas (license 88)
Tested by: tbsky
Joshua Colp [Mon, 3 Mar 2008 15:15:39 +0000 (15:15 +0000)]
Add a comment to describe some logic.
(closes issue #12120)
Reported by: flefoll
Patches:
chan_sip.c.br14.patch-just-a-comment uploaded by flefoll (license 244)
Russell Bryant [Fri, 29 Feb 2008 23:34:32 +0000 (23:34 +0000)]
Fix a major bug in autoservice. There was a race condition in the handling of
the list of channels in autoservice. The problem was that it was possible for
a channel to get removed from autoservice and destroyed, while the autoservice
thread was still messing with the channel. This led to memory corruption, and
caused crashes. This explains multiple backtraces I have seen that have
references to autoservice, but do to the nature of the issue (memory corruption),
could cause crashes in a number of areas.
(fixes the crash in BE-386)
(closes issue #11694)
(closes issue #11940)
The following issues could be related. If you are the reporter of one of these,
please update to include this fix and try again.
Russell Bryant [Thu, 28 Feb 2008 22:23:05 +0000 (22:23 +0000)]
Fix a bug in the lock tracking code that was discovered by mmichelson. The issue
is that if the lock history array was full, then the functions to mark a lock as
acquired or not would adjust the stats for whatever lock is at the end of the array,
which may not be itself. So, do a sanity check to make sure that we're updating
lock info for the proper lock.
(This explains the bizarre stats on lock #63 in BE-396, thanks Mark!)
Jason Parker [Thu, 28 Feb 2008 19:20:10 +0000 (19:20 +0000)]
Make pbx_exec pass an empty string into applications, if we get NULL.
This protects against possible segfaults in applications that may try
to use data before checking length (ast_strdupa'ing it, for example)
Mark Michelson [Wed, 27 Feb 2008 21:49:20 +0000 (21:49 +0000)]
Two fixes:
1. Make the list of ast_dial_channels a lockable list. This is because in some cases,
the ast_dial may exist in multiple threads due to asynchronous execution of its application, and
I found some cases where race conditions could exist.
2. Check in ast_dial_join to be sure that the channel still exists before attempting to lock it, since
it could have gotten hung up but the is_running_app flag on the ast_dial_channel may not have been
cleared yet.
Russell Bryant [Wed, 27 Feb 2008 17:33:04 +0000 (17:33 +0000)]
Fix a problem in ChanSpy where it could get stuck in an infinite loop without
being able to detect that the calling channel hung up.
(closes issue #12076, reported by junky, patched by me)