Alec L Davis [Fri, 1 Apr 2011 08:29:49 +0000 (08:29 +0000)]
voicemail: get real last_message_index and count_messages, ODBC resequence
change last_message_index to read the max msgnum stored in the database
change count_messages to actually count the number of messages.
last_message_index change:
This fixed overwriting of the last message if msgnum=0 was missing.
Previously every incoming message would overwrite msgnum=1.
count_messages change:
allows us to detect when requencing is required in opneA_mailbox.
resequence enabled for ODBC storage:
Assists with fixing up corrupt databases with gaps, but only when
a user actively opens there mailboxes.
(closes issue #18692,#18582,#19032)
Reported by: elguero
Patches:
based on odbc_resequence_mailbox2.1.diff uploaded by elguero (license 37)
Tested by: elguero, nivek, alecdavis
Alec L Davis [Fri, 1 Apr 2011 06:46:56 +0000 (06:46 +0000)]
app_voicemail: close_mailbox needs to respect additional messages while mailbox is open.
close_mailbox leave gaps in message sequence if messages are deleted and new messages
arrive during this time, this is because the shuffle down to slot 0, only shuffles
the number of pre-existing messages when mailbox is opened, ignoring new arrivals.
Fix: in close_mailbox re-evaluate number of messages before the shuffle, this then includes new arrivals.
Terry Wilson [Wed, 16 Mar 2011 16:58:42 +0000 (16:58 +0000)]
Don't delay DTMF in core bridge while listening for DTMF features
This patch is mostly the work of Olle Johansson. I did some cleanup and
added the silence generating code if transmit_silence is set.
When a channel listens for DTMF in the core bridge, the outbound DTMF is not
sent until we have received DTMF_END. For a long DTMF, this is a disaster. We
send 4 seconds of DTMF to Asterisk, which sends no audio for those 4 seconds.
Some products see this delay and the time skew on RTP packets that results and
start ignoring the audio that is sent afterward.
With this change, the DTMF_BEGIN frame is inspected and checked. If it matches
a feature code, we wait for DTMF_END and activate the feature as before. If
transmit_silence=yes in asterisk.conf, silence is sent if we paritally match a
multi-digit feature. If it doesn't match a feature, the frame is forwarded
along with the DTMF_END without delay. By doing it this way, DTMF is not delayed.
(closes issue #15642)
Reported by: jasonshugart
Patches:
issue_15652_dtmf_ast-1.4.patch.txt uploaded by twilson (license 396)
Tested by: globalnetinc, jde
Richard Mudgett [Mon, 14 Mar 2011 16:38:24 +0000 (16:38 +0000)]
"Caller*ID failed checksum" on Wildcard TDM2400P and TDM410
The last character in the caller id message is getting a framing error.
The checksum is the last character in the message. A framing error in the
checksum could be because:
1) The sender did not send a full stop bit.
2) The sender cut off the FSK carrier too soon.
3) The sender opted to send zero of the specified zero to 10 trailing mark
bits and round-off errors in the code resulted in the code not being where
it thought it was in the demodulated bit stream.
Bit 8 of 'b' is set when parity error.
Bit 9 of 'b' is set when framing error.
Made ignore the framing and parity error bits if the errored character is
the checksum. We can tolerate a framing/parity error there. The checksum
character validates the message.
Tilghman Lesher [Sat, 12 Mar 2011 20:22:07 +0000 (20:22 +0000)]
Add AELSub, which provides a stable entry point into AEL subroutines.
This commit needs some explanation, given that we're adding a new application
into an existing release branch. This is generally a violation of our release
policy, except in very limited circumstances, and I believe this is one of
those circumstances.
The problem that this solves is one of the sanity of using multiple dialplan
languages to define a dialplan. In the case of the reporter, he or she is
using AEL is define subroutines, while using Realtime extensions to invoke
those subroutines. While you can do this, it's based upon the reality of AEL
using actual dialplan extensions; however, there is no guarantee that the
details of _how_ AEL is compiled into extensions will remain stable. In fact,
at the time of this commit, it has already changed twice, once in a
fundamental way.
Now normally, a new application would only be added to trunk. However, this
application is explicitly to create a stable user-level API between versions,
and adding it to trunk only will not solve the user's problem of switching
between 1.6.2 and 1.8, nor will it help anybody switching from 1.8 to 1.10.
Therefore, it needs to go into existing release branches. For the sake of
consistency, and also because one of the changes was between 1.4 and 1.6.x,
I am also electing to commit this to 1.4.
Terry Wilson [Thu, 24 Feb 2011 17:42:16 +0000 (17:42 +0000)]
Don't broadcast FullyBooted to every AMI connection
The FullyBooted event should not be sent to every AMI connection every
time someone connects via AMI. It should only be sent to the user who
just connected.
Properly check the bounds of arrays when decoding UDPTL packets. Also, remove broken support for receiving UDPTL packets larger than 16k. That shouldn't ever happen anyway.
Jason Parker [Tue, 15 Feb 2011 23:32:20 +0000 (23:32 +0000)]
Fix regression that changed behavior of queues when ringing a queue member.
This reverts r298596, which was to fix a highly bizarre and contrived issue
with a queue member that called into his own queue being transferred back
into his own queue. I couldn't reproduce that issue in any way. I think one
of the other recent transfer fixes actually fixed this.
Richard Mudgett [Fri, 11 Feb 2011 00:29:17 +0000 (00:29 +0000)]
Reentrancy problem if outgoing call gets different B channel than requested.
The chan_dahdi pri_fixup_principle() routine needs to protect the Asterisk
channel with the channel lock when it changes the technology private
pointer to a new private structure.
* Added lock protection while pri_fixup_principle() moves a call from one
private structure to another.
* Made some pri_fixup_principle() messages more meaningful.
Terry Wilson [Mon, 7 Feb 2011 22:35:20 +0000 (22:35 +0000)]
Don't try to pickup a call in the middle of a masquerade
If A calls B which doesn't answer and C & D both try to do a call pickup, it is
possible for ast_pickup_call to answer the call, then fail to masquerade one of
the calls because the other one is already in the process of masquerading. This
patch checks to see if the channel is in the process of masquerading before
call before selecting it for a pickup.
Terry Wilson [Mon, 7 Feb 2011 21:51:43 +0000 (21:51 +0000)]
Don't allow a REFER w/replaces to replace its own dialog
Asterisk currently accepts a REFER with a Refer-To with an embedded Replaces
header that matches the dialog of the REFER. This would be a situation like A
calls B, A calls C, A transfers B to A, which is just silly. This patch makes
the transfer fail instead of making Asterisk freak out and forget to hang other
channels up.
Terry Wilson [Thu, 3 Feb 2011 20:36:34 +0000 (20:36 +0000)]
Set hangup cause in local_hangup
When a call involves a local channel (like SIP -> Local -> SIP), the hangup
cause was not being set. This resulted in SIP channels sometimes getting a
503 error instead of a 486 when the far side sent a busy. In Asterisk 1.8+
this also can cause issues with CCSS that involve a local channel. This patch
sets the hangupcause for one side of the local channel to the other in
local_hangup for outbound calls.
Richard Mudgett [Thu, 3 Feb 2011 00:02:43 +0000 (00:02 +0000)]
Minor AST_FRAME_TEXT related issues.
* Include the null terminator in the buffer length. When the frame is
queued it is copied. If the null terminator is not part of the frame
buffer length, the receiver could see garbage appended onto it.
* Add channel lock protection with ast_sendtext().
Jason Parker [Mon, 31 Jan 2011 22:56:54 +0000 (22:56 +0000)]
Prevent a crash when dialing a technology with no destination (ex: Dial(SIP/))
chan_iax2 and other channel drivers already had code to prevent this. The
attempt that app_dial was making to prevent it was not correct, so I fixed that.
Jason Parker [Thu, 27 Jan 2011 16:57:46 +0000 (16:57 +0000)]
Fix default prefix=/usr regression on non-Linux systems.
This partially reverts a change made in branches/1.4/ r267759, which will
cause issue #17013 to be reopened. This issue was pointed out by a user
on #asterisk, who helpfully discovered that paths were being set incorrectly.
To truly understand what was wrong, one should run:
svn diff --force -c<this revision> configure
This patch modifies chan_sip to route responses to the address the request came from. It also modifies chan_sip to respect the maddr parameter in the Via header.
Richard Mudgett [Tue, 25 Jan 2011 23:21:09 +0000 (23:21 +0000)]
DTMF attended transfers sometimes fail for no apparent reason.
The loop in feature_request_and_dial() can exit when Party C has answered
without processing an AST_CONTROL_ANSWER. Also sometimes an
AST_CONTROL_ANSWER never happens even though Party C has answered.
Don't hangup Party C if he is up or we receive an AST_CONTROL_ANSWER.
Terry Wilson [Tue, 25 Jan 2011 20:50:59 +0000 (20:50 +0000)]
Guard against retransmitting BYEs indefinitely
In the case of an attended transfer (A calls B, A atxfers to C) where
A becomes unreachable before replying to Asterisk's BYE, Asterisk can
sometimes retransmit the BYE indefinitely. This is because
__sip_autodestruct tests p->refer && !ast_test_flag(&p->flags[0],
SIP_ALREADYGONE and will then transmit a BYE. When this BYE times out,
it will not ever be marked as ALREADYGONE, so when __sip_autodestruct
is called again, we end up starting the cycle over.
This patch adds a call to sip_alreadygone(pkt->owner) in retrans_pkt
in the case of a BYE that has timed out. This should prevent Asterisk
from trying to transmit new BYE messages in the future.
Richard Mudgett [Tue, 25 Jan 2011 17:36:50 +0000 (17:36 +0000)]
Sending out unnecessary PROCEEDING messages breaks overlap dialing.
Issue #16789 was a good idea. Unfortunately, it breaks overlap dialing
through Asterisk. There is not enough information available at this point
to know if dialing is complete. The ast_exists_extension(),
ast_matchmore_extension(), and ast_canmatch_extension() calls are not
adequate to detect a dial through extension pattern of "_9!".
Workaround is to use the dialplan Proceeding() application early in
non-dial through extensions.
* Effectively revert issue #16789.
* Allow outgoing overlap dialing to hear dialtone and other early media.
A PROGRESS "inband-information is now available" message is now sent after
the SETUP_ACKNOWLEDGE message for non-digital calls. An
AST_CONTROL_PROGRESS is now generated for incoming SETUP_ACKNOWLEDGE
messages for non-digital calls.
* Handling of the AST_CONTROL_CONGESTION in chan_dahdi/sig_pri was
inconsistent with the cause codes.
* Added better protection from sending out of sequence messages by
combining several flags into a single enum value representing call
progress level.
(closes issue #18509)
Reported by: wimpy
Patches:
issue18509_early_media_v1.8_v3.patch uploaded by rmudgett (license 664)
Expanded upon issue18509_early_media_v1.8_v3.patch to include analog
and SS7 because of backporting requirements.
Tested by: wimpy, rmudgett
Jeff Peeler [Tue, 25 Jan 2011 16:58:29 +0000 (16:58 +0000)]
Fix voicemail sequencing for file based storage.
A previous change was made to account for when the number of voicemail messages
exceeds the max limit to be handled properly, but it caused gaps in the messages
to not be properly handled. This has now been resolved.
In later non 1.4 branches, it appears that resequencing wasn't even occurring
due from what appears and accidental code removal.
Russell Bryant [Mon, 24 Jan 2011 20:32:21 +0000 (20:32 +0000)]
Fix channel redirect out of MeetMe() and other issues with channel softhangup.
Mantis issue #18585 reports that a channel redirect out of MeetMe() stopped
working properly. This issue includes a patch that resolves the issue by
removing a call to ast_check_hangup() from app_meetme.c. I left that in my
patch, as it doesn't need to be there. However, the rest of the patch fixes
this problem with or without the change to app_meetme.
The key difference between what happens before and after this patch is the
effect of the END_OF_Q control frame. After END_OF_Q is hit in ast_read(),
ast_read() will return NULL. With the ast_check_hangup() removed, app_meetme
sees this which causes it to exit as intended. Checking ast_check_hangup()
caused app_meetme to exit earlier in the process, and the target of the
redirect saw the condition where ast_read() returned NULL.
Removing ast_check_hangup() works around the issue in app_meetme, but doesn't
solve the issue if another application did the same thing. There are also
other edge cases where if an application finishes at the same time that a
redirect happens, the target of the redirect will think that the channel hung
up. So, I made some changes in pbx.c to resolve it at a deeper level. There
are already places that unset the SOFTHANGUP_ASYNCGOTO flag in an attempt to
abort the hangup process. My patch extends this to remove the END_OF_Q frame
from the channel's read queue, making the "abort hangup" more complete. This
same technique was used in every place where a softhangup flag was cleared.
Jason Parker [Fri, 21 Jan 2011 21:45:34 +0000 (21:45 +0000)]
Reset configuration before parsing users.conf.
Some values configured in chan_dahdi.conf were able to leak in to users.conf
configuration. This was surprising users, and potentially setting non-sane
"defaults".
Richard Mudgett [Wed, 19 Jan 2011 21:21:56 +0000 (21:21 +0000)]
DTMF transfer plays the wrong sounds for wrong number or other call failure.
* Set the default for features.conf.sample xferfailsound option to "beeperr"
as documented instead of "pbx-invalid" and corrected the use of it in DTMF
blind transfer (#1).
* Improved DTMF blind transfer handling of wrong numbers.
Most of the concerns in this issue were taken care of by the patch for
issue 17999: Issues with DTMF triggered attended transfers.
Richard Mudgett [Tue, 18 Jan 2011 18:04:36 +0000 (18:04 +0000)]
Issues with DTMF triggered attended transfers.
Issue #17999
1) A calls B. B answers.
2) B using DTMF dial *2 (code in features.conf for attended transfer).
3) A hears MOH. B dial number C
4) C ringing. A hears MOH.
5) B hangup. A still hears MOH. C ringing.
6) A hangup. C still ringing until "atxfernoanswertimeout" expires.
For v1.4 C will ring forever until C answers the dead line. (Issue #17096)
Problem: When A and B hangup, C is still ringing.
Issue #18395
SIP call limit of B is 1
1. A call B, B answered
2. B *2(atxfer) call C
3. B hangup, C ringing
4. Timeout waiting for C to answer
5. Recall to B fails because B has reached its call limit.
Because B reached its call limit, it cannot do anything until the transfer
it started completes.
Issue #17273
Same scenario as issue 18395 but party B is an FXS port. Party B cannot
do anything until the transfer it started completes. If B goes back off
hook before C answers, B hears ringback instead of the expected dialtone.
**********
Note for the issue #17273 and #18395 fix:
DTMF attended transfer works within the channel bridge. Unfortunately,
when either party A or B in the channel bridge hangs up, that channel is
not completely hung up until the transfer completes. This is a real
problem depending upon the channel technology involved.
For chan_dahdi, the channel is crippled until the hangup is complete.
Either the channel is not useable (analog) or the protocol disconnect
messages are held up (PRI/BRI/SS7) and the media is not released.
For chan_sip, a call limit of one is going to block that endpoint from any
further calls until the hangup is complete.
For party A this is a minor problem. The party A channel will only be in
this condition while party B is dialing and when party B and C are
conferring. The conversation between party B and C is expected to be a
short one. Party B is either asking a question of party C or announcing
party A. Also party A does not have much incentive to hangup at this
point.
For party B this can be a major problem during a blonde transfer. (A
blonde transfer is our term for an attended transfer that is converted
into a blind transfer. :)) Party B could be the operator. When party B
hangs up, he assumes that he is out of the original call entirely. The
party B channel will be in this condition while party C is ringing, while
attempting to recall party B, and while waiting between call attempts.
WARNING:
The ATXFER_NULL_TECH conditional is a hack to fix the problem. It will
replace the party B channel technology with a NULL channel driver to
complete hanging up the party B channel technology. The consequences of
this code is that the 'h' extension will not be able to access any channel
technology specific information like SIP statistics for the call.
ATXFER_NULL_TECH is not defined by default.
**********
Only offer codecs both sides support for directmedia
When using directmedia, Asterisk needs to limit the codecs offered to just
the ones that both sides recognize, otherwise they may end up sending audio
that the other side doesn't understand.
Jeff Peeler [Wed, 12 Jan 2011 18:10:42 +0000 (18:10 +0000)]
Fix CPU spike when pressing DTMF after agent login.
The problem here is that DTMF was being continuously deferred and requeued
since ast_safe_sleep is called in a loop. There are serveral other places in the
code that sleeps and then loops in a similar fashion. Because of this fact I
opted to not defer DTMF any more, which will not affect the original fix:
Terry Wilson [Tue, 4 Jan 2011 17:11:48 +0000 (17:11 +0000)]
Don't authenticate SUBSCRIBE re-transmissions
This only skips authentication on retransmissions that are already
authenticated. A similar method is already used for INVITES. This
is the kind of thing we end up having to do when we don't have a
transaction layer...
Tilghman Lesher [Fri, 17 Dec 2010 21:40:56 +0000 (21:40 +0000)]
Let Asterisk find better backtrace information with libbfd.
The menuselect option BETTER_BACKTRACES, if enabled, will use libbfd to search
for better symbol information within both the Asterisk binary, as well as
loaded modules, to assist when using inline backtraces to track down problems.
Jeff Peeler [Thu, 16 Dec 2010 20:46:52 +0000 (20:46 +0000)]
Fix improper hangup when doing an attended transfer to queue.
Had to indicate ringing in wait_for_answer so the attended transfer code would
not try and hang up the local channel it created, which would kill the call.
Sean Bright [Wed, 15 Dec 2010 21:28:29 +0000 (21:28 +0000)]
Fix reference and container leaks when running 'astobj2 test.'
We need to make sure that ao2_iterator_destroy is called once for each time that
ao2_iterator_init is called. Also make sure to unref a newly allocated object
that we've linked into a container.
Richard Mudgett [Mon, 13 Dec 2010 16:56:07 +0000 (16:56 +0000)]
Outgoing PRI/BRI calls cannot do DTMF triggered transfers.
Outgoing PRI/BRI calls cannot do DTMF triggered transfers if a PROCEEDING
message is not received. The debug output shows that the DTMF begin event
is seen, but the DTMF end event is missing. When the DTMF begin happens,
the call is muted so we now have one way audio (until a DTMF end event is
somehow seen).
* Made set the proceeding flag when the PRI_EVENT_ANSWER event is
received.
* Made absorb the DTMF begin and DTMF end events if we are overlap dialing
and have not seen a PROCEEDING message.
* Added a debug message when absorbing a DTMF event.
Terry Wilson [Thu, 9 Dec 2010 22:00:30 +0000 (22:00 +0000)]
Ignore spurious REGISTER requests
If a REGISTER request with a Call-ID matching an existing transaction is received
it was possible that the REGISTER request would overwrite the initreq of the
private structure. This info is used to generate messages for other responses in
the transaction. This patch ignores REGISTER requests that match non-REGISTER
transactions.
Jeff Peeler [Tue, 7 Dec 2010 22:57:48 +0000 (22:57 +0000)]
Revert code that changed SSRC for DTMF.
Some previous behavior was attempted to be restored, but mistakingly I did
not realize that the previous behavior was incorrect. This fixes DTMF not
being detected since DTMF shouldn't cause the SSRC to change.
Tilghman Lesher [Tue, 7 Dec 2010 00:07:37 +0000 (00:07 +0000)]
Don't create a Local channel if the target extension does not exist.
(closes issue #18126)
Reported by: junky
Patches:
followme.diff uploaded by junky (license 177)
(partially restructured by me to avoid a possible memory leak)
Jeff Peeler [Mon, 6 Dec 2010 21:57:15 +0000 (21:57 +0000)]
Improve handling of REGISTER requests with multiple contact headers.
The changes here attempt to more strictly follow RFC 3261 section 10.3.
Basically the following will now cause a 400 Bad Response to be returned, if:
- multiple Contact headers are present with one set to expire all bindings ("*")
- wildcard parameter is specified for Contact without Expires header or Expires
header is not set to zero.
Terry Wilson [Thu, 2 Dec 2010 18:00:27 +0000 (18:00 +0000)]
Initialize offset for adaptive jitter buffer
When the adaptive jitter buffer is enabled in sip.conf, the first frame placed
in the jitter buffer fails with something like:
jb_warning_output: Resyncing the jb. last_delay 0, this delay -215886466,
threshold 1000, new offset 215886466
This happens because the offset is not initialized before calling jb_put(). This
patch modifies jb_put_first_adaptive() to set the offset to the frame's
timestamp.
Russell Bryant [Thu, 2 Dec 2010 13:16:15 +0000 (13:16 +0000)]
Add "DAHDI" to a couple of app_meetme error messages.
This is in response to some questions on IRC. To the user, there was nothing
that made it obvious that this error had anything to do with DAHDI not being
loaded.
Jeff Peeler [Wed, 1 Dec 2010 17:50:09 +0000 (17:50 +0000)]
Fix not stopping MOH when transfered local channel queue member is answered.
The problem here is only present when local channels are used with the MOH
passthru option as well as no optimization (/nm). I will describe the slightly
bizarre scenario that was used to test, where phones B and C are queue members:
Phone A dials into a queue with two members using local channels and the above
options. Phone B answers. Phone A blind transfers phone B into the same queue.
Phone A hangs up. Phone C answers, but phone B didn't stop playing MOH.
In this scenario, the unhold frame that should have gotten to phone B never
arrived due to the masquerade from the blind transfer. This is usually fine
since app_queue manages the starting and stopping of MOH. However, with the
passthrough option enabled when app_queue attempts to stop MOH it tries to do
so on the local channel rather than the real channel. The easiest solution
was to just make sure to send an unhold frame during the transfer since it
wouldn't make sense to have MOH playing after a transfer anyway. This only
modifies SIP transfers, but the other transfers did not seem to be a problem.
If DTMF based transfers were a problem it might be okay to add ast_moh_stop
to finishup, but I didn't want to have to add that unless required.
Tilghman Lesher [Wed, 1 Dec 2010 00:20:05 +0000 (00:20 +0000)]
Get rid of the annoying startup and shutdown errors on OS X.
This mainly deals with the problem of constructors on platforms where an
explicit constructor order cannot be specified (any system with gcc 4.2
or less). However, this is only a problem on those systems where we need
to initialize mutexes with a constructor, because we have other code that
also relies upon constructors, and we cannot specify that mutexes are
initialized first (and destroyed last).
There are two approaches to dealing with this issue, related to whether the
code exists in the core Asterisk binary or in a separate code module. In the
core case, constructors are run immediately upon load, and the file_versions
list mutex needs to be already initialized, as it is referenced in the first
constructor within each core source file. In this case, we use pthread_once
to ensure that the mutex is initialized immediately before it is used for the
first time. The only caveat is that the mutex is not ever destroyed, but
because this is the core, it makes no real difference; the only time when
destruction is safe would be just prior to process destruction, which takes
care of that anyway. And due to using pthread_once, the mutex will never be
reinitialized, which means only one structure has leaked at the end of the
process. Hence, it is not a problematic leak.
The second approach is to use the load_module and unload_module routines,
which, for obvious reasons, exist only in loadable modules. In this second
case, we don't have a problem with the constructors, but only with destructor
order, because mutexes can be destroyed before their final usage is employed.
However, we need the mutexes to still be destroyed, in certain scenarios: if
the module is unloaded prior to the process ending, it should be clean, with
no allocations by the module hanging around after that point in time.
Russell Bryant [Wed, 24 Nov 2010 23:26:43 +0000 (23:26 +0000)]
Make Asterisk less crashy.
Since we might not put a new translation path on the channel, go ahead and
set it to NULL right after destroying the old one to ensure we don't try
to free an invalid translation path later on.
Richard Mudgett [Wed, 24 Nov 2010 22:41:07 +0000 (22:41 +0000)]
Oneway audio to SIP phone from FXS port after FXS port gets a CallWaiting pip.
The FXS connected phone has to have CW/CID support to fail, as it will
send back a DTMF 'A' or 'D' when it's ready to receive CallerID. A normal
phone with no CID never fails. Also the SIP phone does not hear MOH when
the CW call is answered.
The DTMF end frame is suppressed when the phone acknowledges the CW signal
for CID. The problem is the DTMF begin frame needs to be suppressed as
well. The DTMF begin frame is causing SIP to start sending the DTMF RTP
frames. Since the DTMF end frame is suppressed, SIP will not stop sending
those DTMF RTP packets.
* Suppress the DTMF begin and end frames when the channel driver is
looking for DTMF digits.
* Fixed a couple issues caused by not cleaning up the CID spill if you
answer the CW call while it is sending the CID spill.
* Fixed not sending CW/CID spill to the phone when the call is natively
bridged. (Fixed by not using native bridge if CW/CID is possible.)
* Suppress received audio when sending CW/CID spills. The other parties
involved do not need to hear the CW/CID spills and may be confused if the
CW call is for them.
* v1.4 does not have the main problem fixed by suppressing the DTMF start
frames. The other three items fixed are relevant.
* If you really must restore native bridging between analog ports, you
need to disable CW/CID either by configuring chan_dahdi.conf
callwaitingcallerid=no or dialing *70 before dialing the number to
temporarily disable CW.
Russell Bryant [Wed, 24 Nov 2010 20:22:32 +0000 (20:22 +0000)]
Fix false reporting of an error by set_format().
In the case that the native format was able to be changed to match the
new requested format, the code proceeded to attempt to build a translation
path, anyway. The result would be NULL, since no translation path is
necessary and resulted in this function thinking an error has occurred.
This case is now specifically caught and no attempt to build a translation
path is attempted.
Thanks to our automated tests and bamboo.asterisk.org for catching this problem
and making a whole lot of noise when things started failing. :-)
Russell Bryant [Wed, 24 Nov 2010 16:48:39 +0000 (16:48 +0000)]
Handle failures building translation paths more effectively.
The problem scenario occurred on a heavily loaded system that was using the
codec_dahdi module and exceeded the hardware transcoding capacity. The failure
mode at that point was not good. The report came in to us as an Asterisk
lock-up. The "core show locks" shows a ton of threads locked up (but no
obvious deadlock). Upon deeper investigation, when the system is in this
state, the CPU was maxed out. The CPU was being consumed by the Asterisk
logger spewing messages on every audio frame for calls set up after transcoder
capacity was reached.
The purpose of this patch is to make Asterisk handle failures to create a
translation path in a more graceful manner. If we can't translate, then the
call just needs to be dropped, as it's not going to work. These are the
changes:
1) In set_format() of channel.c (which is called by set_read_format() and
set_write_format()), it was ignoring if ast_translator_build_path() failed and
returned NULL. It now pays attention to that case and returns a result
reflecting failure. With this change in place, the bridging code will
immediately detect a failure and end the bridge instead of proceeding to try to
bridge frames that can't be translated and making channel drivers freak out by
sending them frames in a format they weren't expecting.
2) In ast_indicate_data() of channel.c, failure of ast_playtones_start() was
ignored. It is now reflected in the return value of the function. This didn't
turn out to have any affect on the bug, but seemed like a good change to leave
in.
3) In app_dial(), when only sending a call to a single endpoint, it will
attempt to do some bridging of its own of early audio. It uses
make_compatible() when it's going to do this. However, it ignored failure from
make compatible. So, even with the fix from #1, if there was early audio going
through app_dial, there would still be a period of invalid frames passing
through. After detecting failure here, Dial() exits.