David Vossel [Fri, 3 Apr 2009 16:29:47 +0000 (16:29 +0000)]
audio_audiohook_write_list() did not correctly update sample size after ast_translate.
audio_audiohook_write_list() did not take into account that the sample size may change after translation depending on if the original frame is is 8khz or 16khz. the sample size is now updated after translating to reflect this possibility. This caused the audio on the receiving end to sound terrible. Thanks to jcolp and mmichelson for helping me work this out.
Mark Michelson [Fri, 3 Apr 2009 14:32:05 +0000 (14:32 +0000)]
Fix the ability to retrieve voicemail messages from IMAP.
A recent change made interactive vm_states no longer get
added to the list of vm_states and instead get stored in
thread-local storage.
In trunk and all the 1.6.X branches, the problem is that
when we search for messages in a voicemail box, we would
attempt to update the appropriate vm_state struct by directly
searching in the list of vm_states instead of using the
get_vm_state_by_imap_user function. This meant we could not
find the interactive vm_state that we wanted.
I came across this while doing some testing of my ast_channel_ao2 branch.
After running a test overnight that generated over 5 million calls, Asterisk
had taken up about 1 GB of my system memory. So, I re-ran the test with
MALLOC_DEBUG turned on. However, it showed no leaks in Asterisk during the
test, even though Asterisk was still consuming it somehow.
Instead, I turned to valgrind, which when run with --leak-check=full, told
me exactly where the leak came from, which was from allocations inside the
radiusclient-ng library. This explains why MALLOC_DEBUG did not report it.
After a bit of analysis, I found that we were leaking a little bit of memory
every time a CDR record was passed to cdr_radius.
I don't actually have a radius server set up to receive CDR records. However,
I always have my development systems compile and install all modules. In
addition to making sure there are not build errors across modules, always
loading modules helps find bugs like this, too, so it is strongly recommend for
all developers.
This API provides a generic way for multiple RTP stacks to be
integrated into Asterisk. Right now there is only one present, res_rtp_asterisk,
which is the existing Asterisk RTP stack. Functionality wise this commit
performs the same as previously. API documentation can be viewed in the
rtp_engine.h header file.
Missed a common case for needing to extend the buffer.
(closes issue #14716)
Reported by: sum
Patches:
20090402__bug14716.diff.txt uploaded by tilghman (license 14)
Tested by: sum
the DAHDI_GETCONF, DAHDI_SETCONF and DAHDI_GET_PARAMS ioctls were recently corrected to show that they do, in fact, read data from userspace as part of their work. due to this fix, valgrind now reports a number of cases where chan_dahdi passed an uninitialized (or partially) buffer to these ioctls, which could lead to unexpected behavior.
this patch corrects chan_dahdi to ensure that buffers passed to these ioctls are always fully initialized.
........
Merge changes from str_substitution that are unrelated to that branch.
Included is a small bugfix to an ast_str helper, but most of these changes
are simply doxygen fixes.
Fixes issue with dropped calles due to re-Invite glare and re-Invites never executing after a 491
Acknowledgement for 491 responses were never being processed because it didn't match our pending invite's seqno. Since the ACK was never processed, the 491 frame would continue to be retransmitted until eventually the call was dropped due to max retries. Now during a pending invite, if we receive another invite, we send an 491 and hold on to that glare invite's seqno in the "glareinvite" variable for that sip_pvt struct. When ACK's are received, we first check to see if it is in response to our pending invite, if not we check to see if it is in response to a glare invite. In this case, it is in response to the glare invite and must be dealt with or the call is dropped. I've changed the wait time for resending the re-Invite after receving a 491 response to comply with RFC 3261. Before this patch the scheduled re-Invite would only change a flag indicating that the re-Invite should be sent out, now it actually sends it out as well.
This change fixes a situation where an audiohook that wants DTMF would not
actually get it. This is in the code path where we end DTMF digit length
emulation while handling a NULL frame.
Kevin P. Fleming [Tue, 31 Mar 2009 21:29:50 +0000 (21:29 +0000)]
Optimizations to the stringfields API
This patch provides a number of optimizations to the stringfields API, focused around saving (not wasting) memory whenever possible. Thanks to Mark Michelson for inspiring this work and coming up with the first two optimizations that are represented here:
Changes:
- Cleanup of some code, fix incorrect doxygen comments
- When a field is emptied or replaced with a new allocation, decrease the amount of 'active' space in the pool it was held in; if that pool reaches zero active space, and is not the current pool, then free it as it is no longer in use
- When allocating a pool, try to allocate a size that will fit in a 'standard' malloc() allocation without wasting space
- When allocating space for a field, store the amount of space in the two bytes immediately preceding the field; this eliminates the need to call strlen() on the field when overwriting it, and more importantly it 'remembers' the amount of space the field has available, even if a shorter string has been stored in it since it was allocated
- Don't automatically double the size of each successive pool allocated; it's wasteful
Russell Bryant [Tue, 31 Mar 2009 19:07:58 +0000 (19:07 +0000)]
Improve performance of the code handling the frame queue in chan_iax2.
In my tests that exercised full frame handling in chan_iax2, the version with
these changes took 30% to 40% of the CPU time compared to the same test of
Asterisk trunk before these modifications.
While doing some profiling for <http://reviewboard.digium.com/r/205/>,
one function that caught my eye was network_thread() in chan_iax2.c.
After the things that I was working on there, it was the next target
for analysis and optimization. I used oprofile's source annotation
functionality and found that the loop traversing the frame queue in
network_thread() was to blame for the excessive CPU cycle consumption.
The frame_queue in chan_iax2 previously held all frames that either were
pending transmission or had been transmitted and are still pending
acknowledgment.
In network_thread(), the previous code would go back through the main
for loop after reading a single incoming frame or after being signaled
because a frame had been queued up for initial transmission. In each
iteration of the loop, it traverses the entire frame queue looking for
frames that need to be transmitted. On a busy server, this could easily
be quite a few entries.
This patch is actually quite simple. The frame_queue has become only a list
of frames pending acknowledgment. Frames that need to be transmitted are
queued up to a dedicated transmit thread via the taskprocessor API.
As a result, the code in network_thread() becomes much simpler, as its only
job is to read incoming frames.
In addition to the previously described changes, this patch includes some
additional changes to the frame_queue. Instead of one big frame_queue, now
there is a list per call number to further reduce wasted list traversals.
The biggest impact of this change is in socket_process().
For additional details on testing and test results, see the review request.
Fix incorrect parsing in chan_gtalk when xmpp contains extra whitespaces
To drill into the xmpp to find the capabilities between channels, chan_gtalk
calls iks_child() and iks_next(). iks_child() and iks_next() are functions in
the iksemel xml parsing library that traverse xml nodes. The bug here is that
both iks_child() and iks_next() will return the next iks_struct node
*regardless* of type. chan_gtalk expects the next node to be of type IKS_TAG,
which in most cases, it is, but in this case (a call being made from the
Empathy IM client), there exists iks_struct nodes which are not IKS_TAG data
(they are extraneous whitespaces), and chan_gtalk doesn't handle that case,
so capabilities don't match, and a call cannot be made.
iks_first_tag() and iks_next_tag(), on the other hand, will not return the
very next iks_struct, but will check to see if the next iks_struct is of
type IKS_TAG. If it isn't, it will be skipped, and the next struct of type
IKS_TAG it finds will be returned. This assures that chan_gtalk will find
the iks_struct it is looking for.
This fix simply changes all calls to iks_child() and iks_next() to become
calls to iks_first_tag() and iks_next_tag(), which resolves the capability
matching.
The following is a payload listing from Empathy, which, due to the extraneous
whitespace, will not be parsed correctly by iksemel:
Fix some state_interface stuff that was in trunk but not in the backport to 1.4.
Issue #14359 was fixed between the time that I posted the review of the backport
of the state interface change for 1.4. This merges the changes from that issue
back into 1.4.
Make chan_misdn BRI TE side normally defer channel selection to the NT side.
Channel allocation collisions are not handled by chan_misdn very well.
This patch simply avoids the problem for BRI only.
For PRI, allocation collisions are still possible but less likely since
there are simply more channels available and each end could use a different
allocation strategy.
misdn.conf options available:
te_choose_channel - Use to force the TE side to allocate channels.
method - Specify the channel allocation strategy.
Fix queue weight behavior so that calls in low-weight queues are not inappropriately blocked.
(This is copied and pasted from the review request I made for this patch)
Asterisk has some odd behavior when queue weights are used. The current logic used when
potentially calling a queue member is:
If the member we are going to call is part of another queue and _that other queue has any
callers in it_ and has a higher weight than the queue we are calling from, then don't try
to contact that member. The issue here is what I have marked with underscores. If the
higher-weighted queue has any callers in it at all, then the queue member will be unreachable
from the lower-weighted queue. This has the potential to be really really bad if using a
queue strategy, such as leastrecent or fewestcalls, with the potential to call the same
member repeatedly.
The fix proposed by garychen on issue 13220 is very simple and, as far as I can see, works
well for this situation. With this set of changes, the logic used becomes:
If the member we are going to call is part of another queue, the other queue has a higher
weight than the queue we are calling from, and the higher weight queue has at least as many
callers as available members, then do not try to contact the queue member. If the higher
weighted queue has fewer callers than available members, then there is no reason to deny
the call to this member since the other queue can afford to spare a member.
Since the fix involved writing a generic function for determining the number of available
members in the queue, I also modified the is_our_turn function to make use of the new
num_available_members function to determine if it is our turn to try calling a member. There
is one small behavior change. Before writing this patch, if you had autofill disabled, then
if you were the head caller in a queue, you would automatically be told that it was your
turn to try calling a member. This did not take into account whether there were actually any
queue members available to take the call. Now we actually make sure there is at least one
member available to take the call if autofill is disabled.
Backport state interface changes to app_queue from trunk.
After several issues raised on the Asterisk bugtracker against
the 1.4 branch were determined to be fixable with the state interface
change available in the 1.6.X series, it finally came time to just
suck it up and backport the change.
For a detailed explanation of what this change entails, the original
trunk commit for this feature may be found here:
In addition, the details for the use of this change to fix the problems
stated in issue #12970 may be found in the review request I made for
this change. It is linked below.
Improve our handling of T38 in the initial INVITE from a device.
We now answer with matching media streams to what is requested. If an INVITE
is received with both a T38 and RTP media stream this means we answer with both.
For any outgoing calls created as a result of this inbound one no T38 is requested
in the initial INVITE. Instead if we start receiving udptl packets we trigger a
reinvite on the outbound side.
Leif Madsen [Fri, 27 Mar 2009 19:31:04 +0000 (19:31 +0000)]
Update commit message guidelines in re: to punctuation.
The doxygen documentation has now been updated to state explicitly that I want
punctuation atthe end of the first sentence in a commit message. :).
Kevin P. Fleming [Fri, 27 Mar 2009 19:10:32 +0000 (19:10 +0000)]
Improve timing interface to remember which provider provided a timer
The ability to load/unload timing interfaces is nice, but it means that when a timer is allocated, it may come from provider A, but later provider B becomes the 'preferred' provider. If this happens, all timer API calls on the timer that was provided by provider A will actually be handed to provider B, which will say WTF and return an error.
This patch changes the timer API to include a pointer to the provider of the timer handle so that future operations on the timer will be forwarded to the proper provider.
Joshua Colp [Fri, 27 Mar 2009 15:57:28 +0000 (15:57 +0000)]
Fix a potential timer leak in bridge_softmix.
It is possible for a bridge to be created without actually being used.
In that scenario a timing file descriptor would be opened and not
closed. To fix this the timing file descriptor is now closed in the
destroy callback, not the thread function.
Joshua Colp [Fri, 27 Mar 2009 15:46:46 +0000 (15:46 +0000)]
Fix speech structure leak in the AGI speech recognition integration.
The AGI dialplan applications did not destroy the speech structure automatically
if it was not destroyed by the running AGI script. They will now do this.
Joshua Colp [Fri, 27 Mar 2009 13:57:29 +0000 (13:57 +0000)]
Fix a potential race condition when creating a software based mixing bridge.
It was possible for no timer to become available between creating the bridge
and starting it. We now open a timer when creating it and keep it open until the
bridge is destroyed.
Fix an issue where nat=yes would not always take effect for the RTP session on outgoing calls.
If calls were placed using an IP address or hostname the global nat setting was copied over
but was not set on the RTP session itself. This caused the RTP stack to not perform symmetric RTP
actions.
Russell Bryant [Fri, 27 Mar 2009 02:20:23 +0000 (02:20 +0000)]
Fix some issues with rwlock corruption that caused deadlock like symptoms.
When dvossel and I were doing some load testing last week, we noticed that we
could make Asterisk trunk lock up instantly when we started generating a bunch
of calls. The backtraces of locked threads were bizarre, and many were stuck
on an _unlock_ of an rwlock.
The changes are:
1) Fix a number of places where a backtrace would be loaded into an invalid
index of the backtrace array. It's an off by one error, which ends up
writing over the rwlock itself.
2) Ensure that in the array of held locks, we NULL out an index once it is
not being used so that it's not confusing when analyzing its contents.
3) Remove a bunch of logging referring to an rwlock operating being done
with "deep reentrancy". It is normal for _many_ threads to hold a
read lock on an rwlock.
pri loop TestClient/TestServer fails: server SEND DTMF 8
app_test was failing when sending the last DTMF digit, 8, because of the 100ms pause issued after DTMF is sent. During this pause the other side would hang up causing the test to look like it failed. Now the other side waits a second before hanging up.
Russell Bryant [Wed, 25 Mar 2009 21:57:19 +0000 (21:57 +0000)]
Improve performance of the ast_event cache functionality.
This code comes from svn/asterisk/team/russell/event_performance/.
Here is a summary of the changes that have been made, in order of both
invasiveness and performance impact, from smallest to largest.
1) Asterisk 1.6.1 introduces some additional logic to be able to handle
distributed device state. This functionality comes at a cost.
One relatively minor change in this patch is that the extra processing
required for distributed device state is now completely bypassed if
it's not needed.
2) One of the things that I noticed when profiling this code was that a
_lot_ of time was spent doing string comparisons. I changed the way
strings are represented in an event to include a hash value at the front.
So, before doing a string comparison, we do an integer comparison on the
hash.
3) Finally, the code that handles the event cache has been re-written.
I tried to do this in a such a way that it had minimal impact on the API.
I did have to change one API call, though - ast_event_queue_and_cache().
However, the way it works now is nicer, IMO. Each type of event that
can be cached (MWI, device state) has its own hash table and rules for
hashing and comparing objects. This by far made the biggest impact on
performance.
For additional details regarding this code and how it was tested, please see the
review request.
Avoid destroying the CLI line when moving the cursor backward and trying to autocomplete.
When moving the cursor backward and pressing TAB to autocomplete, a NULL is put
in the line and we are loosing what we have already wrote after the actual
cursor position.
Russell Bryant [Tue, 24 Mar 2009 21:40:44 +0000 (21:40 +0000)]
Exclude slin16, siren7, and siren14 from bandwidth=low and =medium
The default codec configuration for chan_iax2 is bandwidth=low. I noticed
slin16 being negotiated as the codec in some test calls, but that no longer
happens after this change.
David Vossel [Tue, 24 Mar 2009 20:01:29 +0000 (20:01 +0000)]
SIP preferred codec only feature
Added an option to respond to a SIP invite with only the single most preferred joint codec. This limits the options of what codecs the other side can use.
The only way that this leak would occur is if Monitor were started
using the Manager interface and no File: header were given. Discovered
while reviewing the ast_channel_ao2 review request.
........
Cleaning up a few things in detect disconnect patch
Initialized ast_call_feature in detect_disconnect to avoid accessing uninitialized memory. Cleaned up /param tags in features.h. No longer send dynamic features in ast_feature_detect.
Mark Michelson [Thu, 19 Mar 2009 18:10:34 +0000 (18:10 +0000)]
Fix a memory leak associated with queues.
For every attempt that app_queue made to place an outbound call to a queue member,
we would allocate a queue_end_bridge structure. When the bridge for the call had
completed, we would free the structure. Unfortunately not all call attempts actually
end up bridged to a member, so we need to be more selective of when to allocate
the structure. With this change, the allocation occurs in an area where we can
guarantee that the call will be bridged.
feature.conf has a disconnect option. By default this option is set to '*', but it could be anything. If a user wishes to disconnect a call before the other side answers, only '*' will work, regardless if the disconnect option is set to something else. This is because features are unavailable until bridging takes place. The default disconnect option, '*', was hardcoded in app_dial, which doesn't make any sense from a user perspective since they may expect it to be something different. This patch allows features to be detected from outside of the bridge, but not operated on. In this case, the disconnect feature can be detected before briding and handled outside of features.c.
Fix an issue where cancelled outgoing SIP calls would erroneously report the device as "in use."
A user was having an issue where if an outgoing SIP call was canceled, the SIP device
would remain in use if we had not received any response to the initial INVITE we sent out.
The SIP device would remain in use until the autocongestion timer was exhausted.
I tracked down the cause of this to be the section of code I am removing here. I asked several
people what the purpose of this code was meant to be, but no one could give me any sort of
answer as to why this was here. The person who was having this issue has been using this patch
for several months and it has stopped the problems they have had.
Joshua Colp [Thu, 19 Mar 2009 15:37:23 +0000 (15:37 +0000)]
Improve our triggering of a T38 switchover internally when triggered by a received reinvite.
Previously we reached across the channel bridge to get the other party's SIP dialog
structure in order to trigger an outgoing reinvite. This is extremely dangerous to do
and only works if bridged to another SIP channel. This patch changes this to use the
T38 control frame method of requesting a switchover. This change also causes the SIP
channel driver to propogate back whether the switchover worked or not instead of blindly
accepting the incoming T38 reinvite.
Joshua Colp [Wed, 18 Mar 2009 22:22:56 +0000 (22:22 +0000)]
Fix an issue where a T38 control frame would get dropped.
If two channels were bridged together using a generic bridge the T38
control frame would get passed up instead of being indicated on the
other channel.
Allow H.323 Plus library to be used in addition to the OpenH323 library
Chan_h323 can now be compiled against both the previously supported versions of
OpenH323 as well as the current H.323 Plus (version 1.20.2). The configure
script has been modified to look in the default install location of h323 to
hopefully help avoid using the environment variables OPENH323DIR and PWLIBDIR.
Also, the CLI command "h323 show version" has been added which indicates which
version of h323 is in use.
Russell Bryant [Wed, 18 Mar 2009 02:28:55 +0000 (02:28 +0000)]
Merged revisions 182810 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r182810 | russell | 2009-03-17 21:09:13 -0500 (Tue, 17 Mar 2009) | 44 lines
Fix cases where the internal poll() was not being used when it needed to be.
We have seen a number of problems caused by poll() not working properly on
Mac OSX. If you search around, you'll find a number of references to using
select() instead of poll() to work around these issues. In Asterisk, we've
had poll.c which implements poll() using select() internally. However, we
were still getting reports of problems.
vadim investigated a bit and realized that at least on his system, even
though we were compiling in poll.o, the system poll() was still being used.
So, the primary purpose of this patch is to ensure that we're using the
internal poll() when we want it to be used.
The changes are:
1) Remove logic for when internal poll should be used from the Makefile.
Instead, put it in the configure script. The logic in the configure
script is the same as it was in the Makefile. Ideally, we would have
a functionality test for the problem, but that's not actually possible,
since we would have to be able to run an application on the _target_
system to test poll() behavior.
2) Always include poll.o in the build, but it will be empty if AST_POLL_COMPAT
is not defined.
3) Change uses of poll() throughout the source tree to ast_poll(). I feel
that it is good practice to give the API call a new name when we are
changing its behavior and not using the system version directly in all cases.
So, normally, ast_poll() is just redefined to poll(). On systems where
AST_POLL_COMPAT is defined, ast_poll() is redefined to ast_internal_poll().
4) Change poll() in main/poll.c to be ast_internal_poll().
It's worth noting that any code that still uses poll() directly will work fine
(if they worked fine before). So, for example, out of tree modules that are
using poll() will not stop working or anything. However, for modules to work
properly on Mac OSX, ast_poll() needs to be used.
Improve the build system to *properly* remove unnecessary symbols from the runtime global namespace. Along the way, change the prefixes on some internal-only API calls to use a common prefix.
With these changes, for a module to export symbols into the global namespace, it must have *both* the AST_MODFLAG_GLOBAL_SYMBOLS flag and a linker script that allows the linker to leave the symbols exposed in the module's .so file (see res_odbc.exports for an example).
........
Jeff Peeler [Tue, 17 Mar 2009 20:47:31 +0000 (20:47 +0000)]
Allow H.323 Plus library to be used in addition to the OpenH323 library
Chan_h323 can now be compiled against both the previously supported versions of
OpenH323 as well as the current H.323 Plus (version 1.20.2). The configure
script has been modified to look in the default install location of h323 to
hopefully help avoid using the environment variables OPENH323DIR and PWLIBDIR.
Also, the CLI command "h323 show version" has been added which indicates which
version of h323 is in use.