Mark Michelson [Thu, 9 Apr 2009 21:06:26 +0000 (21:06 +0000)]
Add a new option, mwi_from, to sip.conf.
This allows you to change the From header for outgoing MWI
NOTIFY requests. Prior to this, the best you could do was to
set a callerid in the general section of sip.conf. The problem
was that this was used for all outbound requests, not just
MWI NOTIFY requests.
Jeff Peeler [Thu, 9 Apr 2009 19:10:02 +0000 (19:10 +0000)]
Add ability for dialplan execution to continue when caller hangs up.
The F option to app_dial has been modified to accept no parameters and perform
the above functionality. I don't see anywhere else that is doing function
overloading, but this really is the best place for this operation because:
- It makes it close to the 'g' option in the argument list which provides
similar functionality.
- The existing code to support the current F option provides a very
convenient location to add this new feature.
Handle a SIP race condition (reinvite before an ACK) properly.
RFC 5407 explains the proper course of action to take if a
reINVITE is received before the ACK from a previous invite
transaction. What we are to do is to treat the reINVITE as
if it were both an ACK and a reINVITE and process it normally.
Later, when we receive the ACK we had been expecting, we will
ignore it since its CSeq is less than the current iseqno of
the sip_pvt representing this dialog.
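The CSeq handling described above can be sketched roughly as follows. This is a minimal Python model, not the chan_sip code: the `Dialog` class, the `awaiting_ack` flag, and `handle_request()` are hypothetical names; only `iseqno` comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Dialog:
    """Hypothetical stand-in for the per-dialog state (sip_pvt in chan_sip)."""
    iseqno: int = 0              # highest incoming CSeq processed so far
    awaiting_ack: bool = False   # True after an INVITE is answered, until ACKed


def handle_request(dialog: Dialog, method: str, cseq: int) -> str:
    if method == "ACK":
        # A late ACK for an older transaction has a CSeq below iseqno: drop it.
        if cseq < dialog.iseqno:
            return "ignored"
        dialog.awaiting_ack = False
        return "acked"
    if method == "INVITE":
        # A reINVITE that arrives while an ACK is still outstanding is
        # treated as both the missing ACK and a normal reINVITE.
        implicit_ack = dialog.awaiting_ack
        dialog.iseqno = cseq
        dialog.awaiting_ack = True
        return "reinvite+implicit-ack" if implicit_ack else "invite"
    return "other"
```

In this model the delayed ACK for the first INVITE is dropped purely because its CSeq is lower than the stored `iseqno`.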
Race condition between ast_cli_command() and 'module unload' could cause a deadlock.
Add lock timeouts to avoid this potential deadlock.
(closes issue #14705)
Reported by: jamessan
Patches:
20090320__bug14705.diff.txt uploaded by tilghman (license 14)
Tested by: jamessan
........
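As a rough illustration of the lock-timeout idea in the commit above, the following Python sketch gives up on a lock after a bounded wait instead of blocking forever. The helper name `run_with_timeout()` and the default timeout are assumptions made for this example; the actual patch is C code inside Asterisk.

```python
import threading


def run_with_timeout(lock: threading.Lock, action, timeout: float = 0.5):
    """Run action() while holding lock, or return None if the lock
    cannot be acquired within `timeout` seconds."""
    if not lock.acquire(timeout=timeout):
        # The caller can retry or report failure instead of deadlocking.
        return None
    try:
        return action()
    finally:
        lock.release()
```

A caller that gets `None` back knows the lock was contended and can back off, which is what breaks the deadlock cycle.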
David Vossel [Thu, 9 Apr 2009 17:39:10 +0000 (17:39 +0000)]
Fixes deadlock caused by calling get_cid_name with chan locked.
get_cid_name should not be called with a channel lock. get_cid_name calls ast_get_hint which eventually calls pbx_find_extension. pbx_find_extension starts and stops autoservice which should not be done with a channel lock, so get_cid_name should not be called with one.
Mark Michelson [Thu, 9 Apr 2009 17:30:39 +0000 (17:30 +0000)]
Fix a crash in res_musiconhold when using cached realtime moh.
The moh_register function links an mohclass and then immediately
unrefs the class since the container now has a reference. The problem
with using realtime music on hold is that the class is allocated,
registered, and started in one fell swoop. The refcounting logic
resulted in the count being off by one. The same problem did not
happen when using a static config because the allocation and registration
of an mohclass is a separate operation from starting moh. This also did
not affect non-cached realtime moh because the classes are not registered
at all.
I also have modified res_musiconhold to use the _t_ variants of the ao2_
functions so that more info can be gleaned when attempting to trace the
refcounts. I found this to be incredibly helpful for debugging this issue
and there's no good reason to remove it.
Add support for allowing the channel driver to handle transcoding.
This was accomplished using a set of options and the setoption channel callback.
The core calls into the channel driver using these options and the channel driver
either returns success or failure.
Add a dedicated log channel for modules to be able to report security-related events, so that they can be fed into external processes for analysis and possible mitigation efforts.
(inspired by this evening's Toronto Asterisk Users Group meeting and previous discussions amongst various community members)
Jeff Peeler [Wed, 8 Apr 2009 21:00:39 +0000 (21:00 +0000)]
Add timer for features so that backup bridge config can go away
The biggest change done here was elimination of the backup_config for use with
features. Previously, the bridging code upon detecting a feature would set the
start time of the bridge to the start time of the feature. Then after the
feature had either expired or timed out the start time would be reset to the
true bridge start time from the backup_config. Now, the time differences are
calculated with respect to the newly added feature_start_time timeval instead.
There should be no behavior changes from the previous functionality aside from
the bridge timing being unaffected by either valid or partial feature matches.
Previously the timing would be increased by the length of time configured for
featuredigittimeout, which was probably never noticed.
Backport resolution for file descriptor leak in 1.6.0 to 1.4.
This fixes short reads in http manager sessions, such as those done by the
ast-gui branch. (Fixes AST-198)
........
If the first column is empty, output a delimiter anyway.
(closes issue #14848)
Reported by: john8675309
Patches:
20090408__bug14848.diff.txt uploaded by tilghman (license 14)
Tested by: john8675309
Fix a small logical error when loading moh classes.
We were unconditionally incrementing the number of mohclasses
registered. However, we should actually only increment if the
call to moh_register was successful.
While this probably has never caused problems, I noticed it
and decided to fix it anyway.
........
Make a couple of changes with regards to a new message printed in ast_read().
"ast_read() called with no recorded file descriptor" is a new message added
after a bug was discovered. Unfortunately, it seems there are a bunch of places
that potentially make such calls to ast_read() and trigger this error message
to be displayed. This commit does two things to make this message appear
less often.
First, the message has been downgraded to a debug level message if dev mode is
not enabled. The message means a lot more to developers than it does to end users,
and so developers should make an effort to be sure to call ast_read only when
a channel is ready to be read from. However, since this doesn't actually cause an
error in operation and is not something a user can easily fix, we should not spam
their console with these messages.
Second, the message has been moved to after the check for any pending masquerades.
ast_read() being called with no recorded file descriptor should not interfere with
a masquerade taking place.
This could be seen as a simple way of resolving issue #14723. However, I still want
to try to clear out the existing ways of triggering this message, since I feel that
would be a better resolution for the issue.
........
Russell Bryant [Wed, 8 Apr 2009 13:24:48 +0000 (13:24 +0000)]
Start splitting up miscellaneous doxygen documentation into separate files.
doxyref.h was created to hold miscellaneous documentation that was not specific
to a part of the code. This file has grown quite a bit so I decided to start
splitting parts of it out into new files. Now, you can drop a new file into
include/asterisk/doxygen/ and it will be processed by doxygen.
Russell Bryant [Wed, 8 Apr 2009 12:35:57 +0000 (12:35 +0000)]
Update some comments and resolve potential memory corruption in chan_sip.
While browsing chan_sip the other day, I noticed this dangerous code in
dialog_needdestroy(). This function is an ao2_callback. It is absolutely
_not_ okay to unlock the container from within this function. It's also not
clear why it was useful. Given that it could cause memory corruption, I have
removed it.
There was also a TODO comment left describing a potential implementation of
an improvement to the needdestroy handling. I'm not convinced that what was
described is the best choice here, so I have briefly described the way that
this function is used today that could be improved.
Add support for changing the outbound codec on a SIP call using
a dialplan variable.
This adds a dialplan variable (SIP_CODEC_OUTBOUND) which controls
the codec offered for an outgoing SIP call. This is much like the
SIP_CODEC dialplan variable and has the same restrictions. The codec
set must be one that is configured for the call.
Mark Michelson [Fri, 3 Apr 2009 22:41:46 +0000 (22:41 +0000)]
This commit introduces COLP/CONP and Redirecting party information into Asterisk.
The channel drivers which have been most heavily tested with these enhancements are
chan_sip and chan_misdn. Further work is being done to add Q.SIG support and will be
introduced in a later commit. chan_skinny has code added to it here, but according
to user pj, the support on chan_skinny is not working as of now. This will be fixed in
a later commit.
A special thanks goes out to bugtracker user gareth for getting the ball rolling and
providing the initial support for this work. Without his initial work on this, this would
not have been nearly as painless as it was.
This functionality has been tested by Digium's product quality department, as well as a
customer site running thousands of calls every day. In addition, many many many many bugtracker
users have tested this, too.
Fix a bug where DAHDI/Zaptel channels would not properly switch formats when requested
Don't offer AST_FORMAT_SLINEAR on DAHDI/Zaptel channels. While it could provide a slight performance benefit, the translation core in Asterisk has some flaws when a channel driver offers multiple raw formats. This fix is much simpler than fixing the translation core to solve that issue (although that will be done later).
........
Distinguish in a sent email between simple sends and forwards.
(closes issue #11678)
Reported by: jamessan
Patches:
20090330__bug11678.diff.txt uploaded by tilghman (license 14)
Tested by: tilghman, lmadsen
........
Add better support for relaying success or failure of the ast_transfer() API call.
This API call now waits for a special frame from the underlying channel driver to
indicate success or failure. This allows the return value to truly convey whether
the transfer worked or not. In the case of the Transfer() dialplan application this
means the value of the TRANSFERSTATUS dialplan variable is actually accurate.
David Vossel [Fri, 3 Apr 2009 16:29:47 +0000 (16:29 +0000)]
audio_audiohook_write_list() did not correctly update sample size after ast_translate.
audio_audiohook_write_list() did not take into account that the sample size may change after translation depending on whether the original frame is 8 kHz or 16 kHz. The sample size is now updated after translating to reflect this possibility. This caused the audio on the receiving end to sound terrible. Thanks to jcolp and mmichelson for helping me work this out.
Mark Michelson [Fri, 3 Apr 2009 14:32:05 +0000 (14:32 +0000)]
Fix the ability to retrieve voicemail messages from IMAP.
A recent change made interactive vm_states no longer get
added to the list of vm_states and instead get stored in
thread-local storage.
In trunk and all the 1.6.X branches, the problem is that
when we search for messages in a voicemail box, we would
attempt to update the appropriate vm_state struct by directly
searching in the list of vm_states instead of using the
get_vm_state_by_imap_user function. This meant we could not
find the interactive vm_state that we wanted.
I came across this while doing some testing of my ast_channel_ao2 branch.
After running a test overnight that generated over 5 million calls, Asterisk
had taken up about 1 GB of my system memory. So, I re-ran the test with
MALLOC_DEBUG turned on. However, it showed no leaks in Asterisk during the
test, even though Asterisk was still consuming the memory somehow.
Instead, I turned to valgrind, which when run with --leak-check=full, told
me exactly where the leak came from, which was from allocations inside the
radiusclient-ng library. This explains why MALLOC_DEBUG did not report it.
After a bit of analysis, I found that we were leaking a little bit of memory
every time a CDR record was passed to cdr_radius.
I don't actually have a radius server set up to receive CDR records. However,
I always have my development systems compile and install all modules. In
addition to making sure there are not build errors across modules, always
loading modules helps find bugs like this, too, so it is strongly recommended for
all developers.
This API provides a generic way for multiple RTP stacks to be
integrated into Asterisk. Right now there is only one present, res_rtp_asterisk,
which is the existing Asterisk RTP stack. Functionality wise this commit
performs the same as previously. API documentation can be viewed in the
rtp_engine.h header file.
Missed a common case for needing to extend the buffer.
(closes issue #14716)
Reported by: sum
Patches:
20090402__bug14716.diff.txt uploaded by tilghman (license 14)
Tested by: sum
The DAHDI_GETCONF, DAHDI_SETCONF and DAHDI_GET_PARAMS ioctls were recently corrected to show that they do, in fact, read data from userspace as part of their work. Due to this fix, valgrind now reports a number of cases where chan_dahdi passed an uninitialized (or partially initialized) buffer to these ioctls, which could lead to unexpected behavior.
This patch corrects chan_dahdi to ensure that buffers passed to these ioctls are always fully initialized.
........
Merge changes from str_substitution that are unrelated to that branch.
Included is a small bugfix to an ast_str helper, but most of these changes
are simply doxygen fixes.
Fixes issue with dropped calls due to re-Invite glare and re-Invites never executing after a 491
Acknowledgements for 491 responses were never being processed because they didn't match our pending invite's seqno. Since the ACK was never processed, the 491 frame would continue to be retransmitted until eventually the call was dropped due to max retries. Now during a pending invite, if we receive another invite, we send a 491 and hold on to that glare invite's seqno in the "glareinvite" variable for that sip_pvt struct. When ACKs are received, we first check to see if it is in response to our pending invite; if not, we check to see if it is in response to a glare invite. In this case, it is in response to the glare invite and must be dealt with or the call is dropped. I've changed the wait time for resending the re-Invite after receiving a 491 response to comply with RFC 3261. Before this patch the scheduled re-Invite would only change a flag indicating that the re-Invite should be sent out; now it actually sends it out as well.
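The ACK matching described above can be modelled in a few lines. This is a hedged Python sketch rather than the chan_sip implementation: the `Dialog` class and the function names are hypothetical, `pendinginvite` is an assumed field name, and only `glareinvite` is named in the commit message. The retry delay follows the ranges given in RFC 3261 section 14.1.

```python
import random
from dataclasses import dataclass


@dataclass
class Dialog:
    pendinginvite: int = 0   # CSeq of the INVITE still in progress (0 = none)
    glareinvite: int = 0     # CSeq of a colliding INVITE answered with 491


def on_incoming_invite(d: Dialog, cseq: int) -> int:
    if d.pendinginvite:
        d.glareinvite = cseq      # remember it so its ACK can be matched later
        return 491                # Request Pending
    d.pendinginvite = cseq
    return 200


def on_ack(d: Dialog, cseq: int) -> str:
    if d.pendinginvite and cseq == d.pendinginvite:
        d.pendinginvite = 0
        return "acked pending invite"
    if d.glareinvite and cseq == d.glareinvite:
        d.glareinvite = 0
        return "acked 491; stop retransmitting"
    return "unmatched"


def retry_delay_ms(owns_call_id: bool) -> int:
    """Wait before resending a re-INVITE after a 491 (RFC 3261, 14.1)."""
    if owns_call_id:
        return random.randrange(2100, 4001, 10)   # 2.1 s to 4 s, 10 ms units
    return random.randrange(0, 2001, 10)          # 0 s to 2 s, 10 ms units
```

The asymmetric delay ranges make it unlikely that both sides retry at the same moment and collide again.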
This change fixes a situation where an audiohook that wants DTMF would not
actually get it. This is in the code path where we end DTMF digit length
emulation while handling a NULL frame.
Kevin P. Fleming [Tue, 31 Mar 2009 21:29:50 +0000 (21:29 +0000)]
Optimizations to the stringfields API
This patch provides a number of optimizations to the stringfields API, focused around saving (not wasting) memory whenever possible. Thanks to Mark Michelson for inspiring this work and coming up with the first two optimizations that are represented here:
Changes:
- Cleanup of some code, fix incorrect doxygen comments
- When a field is emptied or replaced with a new allocation, decrease the amount of 'active' space in the pool it was held in; if that pool reaches zero active space, and is not the current pool, then free it as it is no longer in use
- When allocating a pool, try to allocate a size that will fit in a 'standard' malloc() allocation without wasting space
- When allocating space for a field, store the amount of space in the two bytes immediately preceding the field; this eliminates the need to call strlen() on the field when overwriting it, and more importantly it 'remembers' the amount of space the field has available, even if a shorter string has been stored in it since it was allocated
- Don't automatically double the size of each successive pool allocated; it's wasteful
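To illustrate the "two bytes immediately preceding the field" idea from the list above, here is a toy Python pool. It is only a sketch under assumptions: the `Pool` class, its method names, and the little-endian layout are invented for this example and do not reflect the real stringfields structures.

```python
import struct


class Pool:
    """Toy string pool: each field is laid out as [2-byte capacity][text + NUL]."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.used = 0

    def alloc(self, text: bytes) -> int:
        need = len(text) + 1                          # room for the NUL terminator
        start = self.used
        if start + 2 + need > len(self.buf):
            raise MemoryError("pool full")
        struct.pack_into("<H", self.buf, start, need)  # remember the capacity
        self.used = start + 2 + need
        offset = start + 2
        self._write(offset, text)
        return offset

    def capacity(self, offset: int) -> int:
        # Read the two bytes just before the field; no strlen() needed.
        return struct.unpack_from("<H", self.buf, offset - 2)[0]

    def set(self, offset: int, text: bytes) -> bool:
        if len(text) + 1 > self.capacity(offset):
            return False                               # caller must allocate anew
        self._write(offset, text)
        return True

    def get(self, offset: int) -> bytes:
        end = self.buf.index(0, offset)
        return bytes(self.buf[offset:end])

    def _write(self, offset: int, text: bytes) -> None:
        self.buf[offset:offset + len(text)] = text
        self.buf[offset + len(text)] = 0
```

Because the capacity is stored rather than derived from the current contents, a field that briefly held a short string can still be reused for a value as long as the original.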
Russell Bryant [Tue, 31 Mar 2009 19:07:58 +0000 (19:07 +0000)]
Improve performance of the code handling the frame queue in chan_iax2.
In my tests that exercised full frame handling in chan_iax2, the version with
these changes took 30% to 40% of the CPU time compared to the same test of
Asterisk trunk before these modifications.
While doing some profiling for <http://reviewboard.digium.com/r/205/>,
one function that caught my eye was network_thread() in chan_iax2.c.
After the things that I was working on there, it was the next target
for analysis and optimization. I used oprofile's source annotation
functionality and found that the loop traversing the frame queue in
network_thread() was to blame for the excessive CPU cycle consumption.
The frame_queue in chan_iax2 previously held all frames that either were
pending transmission or had been transmitted and are still pending
acknowledgment.
In network_thread(), the previous code would go back through the main
for loop after reading a single incoming frame or after being signaled
because a frame had been queued up for initial transmission. In each
iteration of the loop, it traverses the entire frame queue looking for
frames that need to be transmitted. On a busy server, this could easily
be quite a few entries.
This patch is actually quite simple. The frame_queue has become only a list
of frames pending acknowledgment. Frames that need to be transmitted are
queued up to a dedicated transmit thread via the taskprocessor API.
As a result, the code in network_thread() becomes much simpler, as its only
job is to read incoming frames.
In addition to the previously described changes, this patch includes some
additional changes to the frame_queue. Instead of one big frame_queue, now
there is a list per call number to further reduce wasted list traversals.
The biggest impact of this change is in socket_process().
For additional details on testing and test results, see the review request.
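The per-call-number queue layout can be sketched as below. This is an illustrative Python model only: the `FrameQueue` class and its methods are hypothetical, and frames are reduced to bare sequence numbers.

```python
class FrameQueue:
    """Frames awaiting acknowledgment, bucketed by call number."""

    def __init__(self):
        self._by_call = {}

    def add(self, callno: int, oseqno: int) -> None:
        self._by_call.setdefault(callno, []).append(oseqno)

    def ack(self, callno: int, oseqno: int) -> bool:
        # Only the list for this call is searched, never the whole queue.
        frames = self._by_call.get(callno)
        if not frames or oseqno not in frames:
            return False
        frames.remove(oseqno)
        if not frames:
            del self._by_call[callno]
        return True

    def pending(self, callno: int) -> list:
        return list(self._by_call.get(callno, []))
```

With one list per call, the cost of handling an acknowledgment depends on the frames outstanding for that call rather than on the total load of the server.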
Fix incorrect parsing in chan_gtalk when xmpp contains extra whitespace
To drill into the xmpp to find the capabilities between channels, chan_gtalk
calls iks_child() and iks_next(). iks_child() and iks_next() are functions in
the iksemel xml parsing library that traverse xml nodes. The bug here is that
both iks_child() and iks_next() will return the next iks_struct node
*regardless* of type. chan_gtalk expects the next node to be of type IKS_TAG,
which in most cases, it is, but in this case (a call being made from the
Empathy IM client), there exist iks_struct nodes which are not IKS_TAG data
(they are extraneous whitespace), and chan_gtalk doesn't handle that case,
so capabilities don't match, and a call cannot be made.
iks_first_tag() and iks_next_tag(), on the other hand, will not return the
very next iks_struct, but will check to see if the next iks_struct is of
type IKS_TAG. If it isn't, it will be skipped, and the next struct of type
IKS_TAG it finds will be returned. This assures that chan_gtalk will find
the iks_struct it is looking for.
This fix simply changes all calls to iks_child() and iks_next() to become
calls to iks_first_tag() and iks_next_tag(), which resolves the capability
matching.
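The difference between "next node" and "next tag" traversal is easy to reproduce with any DOM-style parser. The sketch below uses Python's xml.dom.minidom instead of iksemel, so `first_tag()` and `next_tag()` are stand-ins written for this example, not the iksemel functions themselves, and the XML in the usage note is made up.

```python
from xml.dom import minidom, Node


def first_tag(node):
    """Return the first child that is an element, skipping text nodes."""
    child = node.firstChild
    while child is not None and child.nodeType != Node.ELEMENT_NODE:
        child = child.nextSibling
    return child


def next_tag(node):
    """Return the next sibling that is an element, skipping text nodes."""
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != Node.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling
```

For a document such as `<description>\n  <payload-type id='0'/>\n</description>`, `firstChild` is a whitespace text node, while `first_tag()` returns the `payload-type` element.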
The following is a payload listing from Empathy, which, due to the extraneous
whitespace, will not be parsed correctly by iksemel:
Fix some state_interface stuff that was in trunk but not in the backport to 1.4.
Issue #14359 was fixed between the time that I posted the review of the backport
of the state interface change for 1.4. This merges the changes from that issue
back into 1.4.
Make chan_misdn BRI TE side normally defer channel selection to the NT side.
Channel allocation collisions are not handled by chan_misdn very well.
This patch simply avoids the problem for BRI only.
For PRI, allocation collisions are still possible but less likely since
there are simply more channels available and each end could use a different
allocation strategy.
misdn.conf options available:
te_choose_channel - Use to force the TE side to allocate channels.
method - Specify the channel allocation strategy.
Fix queue weight behavior so that calls in low-weight queues are not inappropriately blocked.
(This is copied and pasted from the review request I made for this patch)
Asterisk has some odd behavior when queue weights are used. The current logic used when
potentially calling a queue member is:
If the member we are going to call is part of another queue and _that other queue has any
callers in it_ and has a higher weight than the queue we are calling from, then don't try
to contact that member. The issue here is what I have marked with underscores. If the
higher-weighted queue has any callers in it at all, then the queue member will be unreachable
from the lower-weighted queue. This has the potential to be really really bad if using a
queue strategy, such as leastrecent or fewestcalls, with the potential to call the same
member repeatedly.
The fix proposed by garychen on issue 13220 is very simple and, as far as I can see, works
well for this situation. With this set of changes, the logic used becomes:
If the member we are going to call is part of another queue, the other queue has a higher
weight than the queue we are calling from, and the higher weight queue has at least as many
callers as available members, then do not try to contact the queue member. If the higher
weighted queue has fewer callers than available members, then there is no reason to deny
the call to this member since the other queue can afford to spare a member.
Since the fix involved writing a generic function for determining the number of available
members in the queue, I also modified the is_our_turn function to make use of the new
num_available_members function to determine if it is our turn to try calling a member. There
is one small behavior change. Before writing this patch, if you had autofill disabled, then
if you were the head caller in a queue, you would automatically be told that it was your
turn to try calling a member. This did not take into account whether there were actually any
queue members available to take the call. Now we actually make sure there is at least one
member available to take the call if autofill is disabled.
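The old and new weight checks can be compared side by side in a short sketch. This is a Python model under assumptions: both function names and the `(weight, callers, available_members)` tuple layout are invented for illustration and do not mirror the app_queue data structures.

```python
def may_call_member_old(our_weight, other_queues):
    """Old rule: any caller in a higher-weight queue blocks the member."""
    for weight, callers, _available in other_queues:
        if weight > our_weight and callers > 0:
            return False
    return True


def may_call_member_new(our_weight, other_queues):
    """New rule: block only if the higher-weight queue cannot spare a member.

    other_queues holds one (weight, callers, available_members) tuple for
    each other queue the member belongs to.
    """
    for weight, callers, available in other_queues:
        if weight > our_weight and callers >= available:
            return False
    return True
```

With one caller and three available members in the higher-weight queue, the old rule denies the call while the new rule allows it.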
Backport state interface changes to app_queue from trunk.
After several issues raised on the Asterisk bugtracker against
the 1.4 branch were determined to be fixable with the state interface
change available in the 1.6.X series, it finally came time to just
suck it up and backport the change.
For a detailed explanation of what this change entails, the original
trunk commit for this feature may be found here:
In addition, the details for the use of this change to fix the problems
stated in issue #12970 may be found in the review request I made for
this change. It is linked below.
Improve our handling of T38 in the initial INVITE from a device.
We now answer with matching media streams to what is requested. If an INVITE
is received with both a T38 and RTP media stream this means we answer with both.
For any outgoing calls created as a result of this inbound one no T38 is requested
in the initial INVITE. Instead if we start receiving udptl packets we trigger a
reinvite on the outbound side.
Leif Madsen [Fri, 27 Mar 2009 19:31:04 +0000 (19:31 +0000)]
Update commit message guidelines regarding punctuation.
The doxygen documentation has now been updated to state explicitly that I want
punctuation at the end of the first sentence in a commit message. :).
Kevin P. Fleming [Fri, 27 Mar 2009 19:10:32 +0000 (19:10 +0000)]
Improve timing interface to remember which provider provided a timer
The ability to load/unload timing interfaces is nice, but it means that when a timer is allocated, it may come from provider A, but later provider B becomes the 'preferred' provider. If this happens, all timer API calls on the timer that was provided by provider A will actually be handed to provider B, which will say WTF and return an error.
This patch changes the timer API to include a pointer to the provider of the timer handle so that future operations on the timer will be forwarded to the proper provider.
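A small model of the fix, assuming hypothetical `Provider`, `Timer`, and `TimingCore` classes: each timer keeps a reference to the provider that created it, so later calls do not depend on which provider is currently preferred. This is a Python sketch of the idea, not the Asterisk timing API.

```python
class Provider:
    def __init__(self, name: str):
        self.name = name
        self._handles = set()
        self._next = 0

    def open(self) -> int:
        self._next += 1
        self._handles.add(self._next)
        return self._next

    def ack(self, handle: int) -> str:
        if handle not in self._handles:
            raise ValueError(f"{self.name} does not own handle {handle}")
        return f"{self.name} acked {handle}"


class Timer:
    def __init__(self, provider: Provider, handle: int):
        self.provider = provider   # remembered for the lifetime of the timer
        self.handle = handle


class TimingCore:
    def __init__(self):
        self.preferred = None

    def register(self, provider: Provider) -> None:
        self.preferred = provider

    def open_timer(self) -> Timer:
        return Timer(self.preferred, self.preferred.open())

    def ack(self, timer: Timer) -> str:
        # Dispatch to the provider stored in the timer, not self.preferred.
        return timer.provider.ack(timer.handle)
```

If provider B is registered after a timer was opened through provider A, `TimingCore.ack()` still reaches A, whereas handing the same handle to B would fail.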
Joshua Colp [Fri, 27 Mar 2009 15:57:28 +0000 (15:57 +0000)]
Fix a potential timer leak in bridge_softmix.
It is possible for a bridge to be created without actually being used.
In that scenario a timing file descriptor would be opened and not
closed. To fix this the timing file descriptor is now closed in the
destroy callback, not the thread function.
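The ownership change can be pictured with a minimal sketch in which the descriptor is released by the destroy step regardless of whether the worker ever ran. The `SoftmixBridge` class below is a hypothetical Python stand-in, and opening `os.devnull` merely simulates opening a timing file descriptor.

```python
import os


class SoftmixBridge:
    def __init__(self):
        # Stand-in for opening a timing file descriptor at creation time.
        self.timing_fd = os.open(os.devnull, os.O_RDONLY)

    def run(self):
        # The mixing loop would use self.timing_fd here but must not close it,
        # because a bridge that is never started would then leak it.
        pass

    def destroy(self):
        # Always reached, even if run() was never called.
        if self.timing_fd != -1:
            os.close(self.timing_fd)
            self.timing_fd = -1
```

Tying the close to `destroy()` means the descriptor's lifetime matches the bridge's lifetime, not the thread's.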
Joshua Colp [Fri, 27 Mar 2009 15:46:46 +0000 (15:46 +0000)]
Fix speech structure leak in the AGI speech recognition integration.
The AGI dialplan applications did not destroy the speech structure automatically
if it was not destroyed by the running AGI script. They will now do this.
Joshua Colp [Fri, 27 Mar 2009 13:57:29 +0000 (13:57 +0000)]
Fix a potential race condition when creating a software based mixing bridge.
It was possible for a timer to become unavailable between creating the bridge
and starting it. We now open a timer when creating it and keep it open until the
bridge is destroyed.
Fix an issue where nat=yes would not always take effect for the RTP session on outgoing calls.
If calls were placed using an IP address or hostname the global nat setting was copied over
but was not set on the RTP session itself. This caused the RTP stack to not perform symmetric RTP
actions.
Russell Bryant [Fri, 27 Mar 2009 02:20:23 +0000 (02:20 +0000)]
Fix some issues with rwlock corruption that caused deadlock like symptoms.
When dvossel and I were doing some load testing last week, we noticed that we
could make Asterisk trunk lock up instantly when we started generating a bunch
of calls. The backtraces of locked threads were bizarre, and many were stuck
on an _unlock_ of an rwlock.
The changes are:
1) Fix a number of places where a backtrace would be loaded into an invalid
index of the backtrace array. It's an off by one error, which ends up
writing over the rwlock itself.
2) Ensure that in the array of held locks, we NULL out an index once it is
not being used so that it's not confusing when analyzing its contents.
3) Remove a bunch of logging referring to an rwlock operation being done
with "deep reentrancy". It is normal for _many_ threads to hold a
read lock on an rwlock.