Richard Mudgett [Tue, 6 Oct 2015 23:01:37 +0000 (18:01 -0500)]
res_pjsip: Fix deadlock when sending out-of-dialog requests.
The struct send_request_wrapper has a pjsip lock associated with it that
is created non-recursive. There is a code path for the struct
send_request_wrapper lock that will attempt to lock it recursively. The
reporter's deadlock showed that the thread calling endpt_send_request()
deadlocked itself right after the wrapper object got created.
Out-of-dialog requests such as MESSAGE, qualify OPTIONS, and unsolicited
MWI NOTIFY messages can hit this deadlock.
* Replaced the struct send_request_wrapper pjsip lock with the mutex lock
that can come with an ao2 object since all of Asterisk's mutexes are
recursive. Benefits include removal of code maintaining the pjsip
non-recursive lock since ao2 objects already know how to maintain their
own lock and the lock will show up in the CLI "core show locks" output.
StefanEng86 [Tue, 6 Oct 2015 16:05:00 +0000 (18:05 +0200)]
res/res_rtp_asterisk.c: Fix incorrect assignment of frame->subclass.frame_ending
In ast_rtp_read, the value of the variable 'mark' which we try to assign to a
frame->subclass.frame_ending may be 0, 1 or (1<<23), but we should translate
it to 0 or 1.
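The translation described above can be sketched as follows. This is a minimal illustration, not the actual patch; the helper name normalize_mark is hypothetical and does not appear in res_rtp_asterisk.c.

```c
#include <assert.h>

/* Hypothetical helper: collapse any non-zero marker value
 * (for example 1 or 1 << 23) to 1, and keep 0 as 0. */
int normalize_mark(unsigned int mark)
{
    return mark ? 1 : 0;
}
```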
Matt Jordan [Wed, 7 Oct 2015 01:43:58 +0000 (20:43 -0500)]
res/res_rtp_asterisk: Fix assignment after ao2 decrement
When we decide we will no longer schedule an RTCP write, we remove the
reference to the RTP instance, then assign -1 to the stored scheduler ID
in case something else comes along and wants to see if anything is scheduled.
That scheduler ID is on the RTP instance. After 60a9172d7ef2 was merged to
fix the regression introduced by 3cf0f29310, this improper assignment on a
potentially destroyed object started getting tripped on the build agents.
Frankly, this should have been crashing a lot more often earlier. I can only
assume that the timing was changed just enough by both changes to start
actually hitting this problem.
As it is, simply moving the assignment prior to the ao2 dereference is
sufficient to keep the RTP instance from being referenced when it is very,
truly, absolutely dead.
(Note that it is still good practice to assign -1 to the scheduler ID when we
know we won't be scheduling it again, as the ao2 deref *may* not always destroy
the ao2 object.)
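The ordering issue can be illustrated with a toy reference-counted object. This is only a sketch under stated assumptions: struct toy_instance, toy_unref() and toy_stop_rtcp() are hypothetical stand-ins, not the ao2 API or the real RTP instance.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted RTP instance (hypothetical). */
struct toy_instance {
    int refcount;
    int schedid;
};

/* Drop one reference; free the object when the count reaches zero.
 * Returns 1 if the object was destroyed, 0 otherwise. */
int toy_unref(struct toy_instance *inst)
{
    if (--inst->refcount == 0) {
        free(inst);
        return 1;
    }
    return 0;
}

/* Clear the scheduler ID first, then give up the scheduler's reference.
 * After toy_unref() the pointer must not be touched by this function. */
int toy_stop_rtcp(struct toy_instance *inst)
{
    inst->schedid = -1;
    return toy_unref(inst);
}
```

Doing the two steps in the opposite order would write to memory that may already have been freed whenever the dropped reference was the last one.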
chan_sip: Fix port parsing for IPv6 addresses in SIP Via headers.
If a Via header contains an IPv6 address and the port number is omitted
because it is the standard port, we now leave the port empty and do not set
it to the value after the first colon of the IPv6 address.
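A rough sketch of bracket-aware port extraction follows. It assumes the IPv6 address appears in brackets, as SIP requires for IPv6 references; the helper name via_port is hypothetical and this is not the chan_sip code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the port from a Via sent-by value such as
 * "[2001:db8::1]:5070", "[2001:db8::1]" or "192.0.2.1:5060".
 * Returns 0 when no port is present. */
int via_port(const char *sent_by)
{
    const char *colon;

    if (sent_by[0] == '[') {
        /* IPv6 reference: a port can only follow the closing bracket. */
        const char *end = strchr(sent_by, ']');
        if (!end || end[1] != ':') {
            return 0;
        }
        colon = end + 1;
    } else {
        /* Hostname or IPv4 address: the first colon starts the port. */
        colon = strchr(sent_by, ':');
        if (!colon) {
            return 0;
        }
    }
    return atoi(colon + 1);
}
```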
Matt Jordan [Tue, 6 Oct 2015 02:34:41 +0000 (21:34 -0500)]
Fix improper usage of scheduler exposed by 5c713fdf18f
When 5c713fdf18f was merged, it allowed for scheduled items to have an ID of
'0' returned. While this was valid per the documentation for the API, it was
apparently never returned previously. As a result, several users of the
scheduler API viewed the result as being invalid, causing them to reschedule
already scheduled items or otherwise fail in interesting ways.
This patch corrects the users such that they view '0' as valid, and a returned
ID of -1 as being invalid.
Note that the failing HEP RTCP tests now pass with this patch. These tests
failed due to a duplicate scheduling of the RTCP transmissions.
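The difference between the two checks can be sketched as below. Both function names are hypothetical and only illustrate the comparison; they are not taken from the scheduler API.

```c
#include <assert.h>

/* As described above: 0 is a valid scheduler ID, -1 means "not scheduled". */
int sched_id_is_valid(int id)
{
    return id > -1;
}

/* The mistaken style of check, which treats an ID of 0 as invalid. */
int sched_id_is_valid_buggy(int id)
{
    return id > 0;
}
```

A caller using the second form would see a freshly scheduled item with ID 0 as unscheduled and schedule it again.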
Debian Amtelco [Wed, 26 Aug 2015 21:58:04 +0000 (21:58 +0000)]
chan_pjsip: Add Referred-By header to the PJSIP REFER packet.
Some systems require the REFER packet to include a Referred-By header.
If the channel variable SIPREFERREDBYHDR is set, it passes that value as the
Referred-By header value. Otherwise, it adds the current dialog’s local info.
Ivan Poddubny [Sat, 3 Oct 2015 11:27:27 +0000 (14:27 +0300)]
manager: Fix GetConfigJSON returning invalid JSON
When GetConfigJSON was introduced back in 1.6, it returned each
section as an array of strings: ["key=value", "key2=value2"].
Afterwards, it was changed a few times and became
["key": "value", "key2": "value2"], which is not valid JSON.
This patch fixes that by constructing a JSON object {} instead of
an array [].
Also, the keys "istemplate" and "templates" that are used to
indicate templates and their inherited categories are now wrapped in
quotes.
Richard Mudgett [Wed, 30 Sep 2015 22:28:19 +0000 (17:28 -0500)]
res_sorcery_memory_cache.c: Fix deadlock with scheduler.
A deadlock can happen when a sorcery object is being expired from the
memory cache when at the same time another object is being placed into the
memory cache. There are a couple other variations on this theme that
could cause the deadlock. Basically if an object is being expired from
the sorcery memory cache at the same time as another thread tries to
update the next object expiration timer the deadlock can happen.
* Add a deadlock avoidance loop in expire_objects_from_cache() to check if
someone is trying to remove the scheduler callback from the scheduler.
Richard Mudgett [Thu, 1 Oct 2015 19:27:34 +0000 (14:27 -0500)]
res_sorcery_memory_cache.c: Shutdown in a less crash potential order.
Basically you should shut down in the opposite order of how you set up,
since later setup pieces likely depend on earlier setup pieces. For
example, registering your external API with the rest of the system should
be the last thing set up and the first thing unregistered during shutdown.
Matt Jordan [Mon, 28 Sep 2015 01:45:50 +0000 (20:45 -0500)]
res/res_stasis: Fix accidental subscription to 'all' bridge topic
Since b99a7052621700a1aa641a1c24308f5873275fc8 was merged, subscribing to a
NULL bridge causes app_subscribe_bridge to implicitly subscribe to
all bridges. Unfortunately, the res_stasis control loop did not check that
a bridge changing on a channel's control object was actually also non-NULL.
As a result, app_subscribe_bridge will be called with a NULL bridge when a
channel leaves a bridge. This causes a new subscription to be made to the
bridge. If an application has also subscribed to the bridge, the application
will now have two subscriptions:
(1) The explicit one created by the app
(2) The implicit one accidentally created by the control structure
As a result, the 'BridgeDestroyed' event can be sent multiple times. This
patch corrects the control loop such that it only subscribes an application
to a new bridge if the bridge pointer is non-NULL.
Richard Mudgett [Thu, 24 Sep 2015 19:56:24 +0000 (14:56 -0500)]
app_queue.c: Force COLP update if outgoing channel name changed.
* When a call is answered and the outgoing channel name has changed then
force a connected line update because the channel is no longer the same.
The channel was masqueraded into by another channel. This is usually
because of a call pickup.
Note: Forwarded calls are handled in a controlled manner so the original
channel name is replaced with the forwarded channel.
Richard Mudgett [Thu, 24 Sep 2015 17:59:08 +0000 (12:59 -0500)]
app_dial.c: Force COLP update if outgoing channel name changed.
* When a call is answered and the outgoing channel name has changed then
force a connected line update because the channel is no longer the same.
The channel was masqueraded into by another channel. This is usually
because of a call pickup.
Note: Forwarded calls are handled in a controlled manner so the original
channel name is replaced with the forwarded channel.
Mark Michelson [Wed, 23 Sep 2015 19:02:15 +0000 (14:02 -0500)]
logger: Prevent duplicate dynamic channels from being added.
There was a problem observed where the "logger add channel" CLI command
would allow for a channel with the same name to be added multiple times.
This would result in each message being written out to the same file
multiple times.
The problem was due to the difference in how logger channel filenames
are stored versus the format they are allowed to be presented when they
are added. For instance, if adding the logger channel "foo" through the
CLI, the result would be a logger channel with the file name
/var/log/asterisk/foo being stored. So when trying to add another "foo"
channel, "foo" would not match "/var/log/asterisk/foo" so we'd happily
add the duplicate channel.
The fix presented here is to introduce two new methods in the logger
code:
* make_filename(): given a logger channel name, this creates the
filename for that logger channel.
* find_logchannel(): given a logger channel name, this calls
make_filename() and then traverses the list of logchannels in order
to find a match.
This change makes use of make_filename() and find_logchannel()
throughout so that the logger code behaves more consistently.
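The lookup described above might look roughly like this. The function names come from the commit message, but the signatures, the TOY_LOG_DIR constant and the array-based channel list are assumptions made for this sketch, not the logger.c implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_LOG_DIR "/var/log/asterisk"   /* assumed log directory */

/* Build the stored filename for a channel name: absolute paths are kept
 * as given, anything else is placed under TOY_LOG_DIR. */
void make_filename(const char *channel, char *buf, size_t size)
{
    if (channel[0] == '/') {
        snprintf(buf, size, "%s", channel);
    } else {
        snprintf(buf, size, "%s/%s", TOY_LOG_DIR, channel);
    }
}

/* Return the index of the channel whose stored filename matches the
 * filename derived from the given name, or -1 if there is none. */
int find_logchannel(const char *channel, const char *stored[], int count)
{
    char filename[256];
    int i;

    make_filename(channel, filename, sizeof(filename));
    for (i = 0; i < count; i++) {
        if (strcmp(stored[i], filename) == 0) {
            return i;
        }
    }
    return -1;
}
```

Because both the add path and the lookup path go through the same filename construction, "foo" and "/var/log/asterisk/foo" resolve to the same entry.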
Mark Michelson [Thu, 24 Sep 2015 19:49:46 +0000 (14:49 -0500)]
Do not swallow frames on channels leaving bridges.
When leaving a bridge, indications on a channel could be swallowed by
the internal indication logic because it appears that the channel is on
its way to be hung up anyway. One such situation where this is
detrimental is when channels on hold are redirected out of a bridge. The
AST_CONTROL_UNHOLD indication from the bridging code is swallowed,
leaving the channel in question to still appear to be on hold.
The fix here is to modify the logic inside ast_indicate_data() to not
drop the indication if the channel is simply leaving a bridge. This way,
channels on hold redirected out of a bridge revert to their expected "in
use" state after the redirection.
Richard Mudgett [Tue, 22 Sep 2015 22:08:49 +0000 (17:08 -0500)]
app_page.c: Fix crash when forwarding with a predial handler.
Page uses the async method of dialing with the dial API. When a call gets
forwarded there is no calling channel available. If the predial handler
was set then the calling channel could not be put into auto-service
for the forwarded call because it doesn't exist. A crash is the result.
* Moved the callee predial parameter string processing to before the
string is passed to the dial API rather than having the dial API do it.
There are a few benefits to doing this. The first is that the predial
parameter string processing doesn't need to be done for each channel
called by the dial API. The second is that in async mode, when the
forwarded channel is to have the predial handler executed on it, the
non-existent calling channel does not need to be present to process the
predial parameter string.
* Don't start auto-service on a non-existent calling channel to execute
the predial handler when the dial API is in async mode and forwarding a
call.
Matt Jordan [Fri, 4 Sep 2015 02:19:21 +0000 (21:19 -0500)]
ARI: Add events for Contact and Peer Status changes
This patch adds support for receiving events regarding Peer status changes
and Contact status changes. This is particularly useful in scenarios where
we are subscribed to all endpoints and channels, where we often want to know
more about the state of channel technology specific items than a single
endpoint's state.
Matt Jordan [Fri, 4 Sep 2015 17:24:57 +0000 (12:24 -0500)]
res/res_stasis_device_state: Allow for subscribing to 'all' device state
This patch adds support for subscribing to all device state changes. This is
done either by subscribing to an empty device, e.g., 'eventSource=deviceState:',
or by the WebSocket connection specifying that it wants all state in the
system.
Matt Jordan [Fri, 4 Sep 2015 17:25:07 +0000 (12:25 -0500)]
ARI: Add the ability to subscribe to all events
This patch adds the ability to subscribe to all events. There are two possible
ways to accomplish this:
(1) On initial WebSocket connection. This patch adds a new query parameter,
'subscribeAll'. If present and True, Asterisk will subscribe the
applications to all ARI events.
(2) Via the applications resource. When subscribing in this manner, an ARI
client should merely specify a blank resource name, i.e., 'channels:'
instead of 'channels:12354'. This will subscribe the application to all
resources of the 'channels' type.
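The blank-resource convention from option (2) can be sketched as a small check on the event source string. The helper name is_subscribe_all is hypothetical and this is not how res_stasis parses event sources; it only illustrates the 'channels:' versus 'channels:12354' distinction.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper. Returns 1 if the event source names a resource type
 * with a blank resource name (e.g. "channels:"), 0 if it names a specific
 * resource (e.g. "channels:12354"), and -1 if it contains no ':' at all. */
int is_subscribe_all(const char *source)
{
    const char *colon = strchr(source, ':');

    if (!colon) {
        return -1;
    }
    return colon[1] == '\0' ? 1 : 0;
}
```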
core/logging: Fix logging to more than one syslog channel
Currently, Asterisk will log to the last configured syslog
channel in logger.conf. This is due to the fact that the
final call to openlog() supersedes all of the previous calls.
This commit removes the call to openlog() and passes the
facility to ast_log_vsyslog(), along with utilizing the
LOG_MAKEPRI macro to ensure that the message is routed to
the correct facility and with the correct priority.
pbx: Update device and presence state when changing a hint extension.
When changing a hint extension without removing the hint first, the
device state and presence state are not updated. This causes the state
of the hint to be that of the previous extension and not the current
one. This state is kept until a state change occurs as a result of
something (presence state change, device state change).
This change updates the hint with the current device and presence
state of the new extension when it is changed. Any state callbacks
which may have been added before the hint extension is changed are
also informed of the new device and presence state if either has
changed.
Walter Doekes [Thu, 17 Sep 2015 09:52:09 +0000 (11:52 +0200)]
chan_sip: Fix From header truncation for extremely long CALLERID(name).
The CALLERID(num) and CALLERID(name) and other info are placed into the
`char from[256]` in initreqprep. If the name was too long, the addr-spec
and params wouldn't fit.
Code is moved around so the addr-spec with params is placed there first,
and then as much of the display-name as possible is fitted in.
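The idea of reserving room for the addr-spec and truncating only the display-name can be sketched as follows. The helper build_from and its signature are hypothetical; this is not the initreqprep code and it ignores params and quoting of special characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build `"name" <addr>` into out, truncating only
 * the display name. Returns 0 on success, or -1 if even the bare
 * `<addr>` form does not fit in the buffer. */
int build_from(char *out, size_t size, const char *name, const char *addr)
{
    size_t addr_len = strlen(addr);
    size_t fixed = addr_len + 5;      /* two quotes, space, '<' and '>' */
    size_t room;

    if (size < addr_len + 3) {        /* '<' + addr + '>' + NUL */
        return -1;
    }
    if (name[0] == '\0' || size < fixed + 2) {
        /* No name, or no room for even one name character. */
        snprintf(out, size, "<%s>", addr);
        return 0;
    }
    room = size - fixed - 1;          /* characters left for the name */
    snprintf(out, size, "\"%.*s\" <%s>", (int) room, name, addr);
    return 0;
}
```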
Kevin Harwell [Thu, 17 Sep 2015 21:47:33 +0000 (16:47 -0500)]
app_queue: AgentComplete event has wrong reason
When a queued caller transfers an agent to another extension sometimes the
raised AgentComplete event has a reason of "caller" and sometimes "transfer".
Since a transfer has taken place this should always be transfer. This occurs
because sometimes the stasis hangup event arrives before the transfer event
thus writing a different reason out.
With this patch, when a hangup event is received during a transfer it will
check to see if the channel that is hanging up is part of a transfer. If so
it will return and let the subsequently received transfer event handler take
care of the cleanup.
Kevin Harwell [Thu, 17 Sep 2015 16:31:15 +0000 (11:31 -0500)]
app_queue: Crash when transferring
During some transfer scenarios involving queues Asterisk would sometimes
crash when trying to obtain a channel snapshot (could happen on caller or
member channels). This occurred because the underlying channel had already
disappeared when trying to obtain the latest snapshot.
This patch adds a reference to both the member and caller channels that
extends to the lifetime of the queued call, thus making sure the channels
will always exist when retrieving the latest snapshots.
Mark Michelson [Wed, 16 Sep 2015 22:36:32 +0000 (17:36 -0500)]
res_pjsip_pubsub: Eliminate race during initial NOTIFY.
There is a slim chance of a race condition occurring where two threads
can both attempt to manipulate the same area.
Thread A can be handling an incoming initial SUBSCRIBE request. Thread A
lets the specific subscription handler know that the subscription has
been established.
At this point, Thread B may detect a state change on the subscribed
resource and queue up a notification task on Thread C, the subscription
serializer thread.
Now Thread A attempts to generate the initial NOTIFY request to send to
the subscriber at the same time that Thread C attempts to generate a
state change NOTIFY request to send to the subscriber.
The result is that Threads A and C can step on the same memory area,
resulting in a crash. The crash has been observed as happening when
attempting to allocate more space to hold the body for the NOTIFY.
The solution presented here is to queue the subscription establishment
and initial NOTIFY generation onto the subscription serializer thread
(Thread C in the above scenario). This way, there is no way that a state
change notification can occur before the initial NOTIFY is sent, and if
there is a quick succession of NOTIFYs, we can guarantee that the two
NOTIFY requests will be sent in succession.
Alexander Traud [Fri, 28 Aug 2015 20:42:23 +0000 (22:42 +0200)]
translate: Fix transcoding while different in frame size.
When Asterisk translates between codecs, each with a different frame size (for
example between iLBC 30 and Speex-WB), too large frames were created by
ast_trans_frameout. Now, ast_trans_frameout is called with the correct frame
length, creating several frames when necessary. Affects all transcoding modules
which used ast_trans_frameout: GSM, iLBC, LPC10, and Speex.
Mark Michelson [Thu, 10 Sep 2015 22:19:26 +0000 (17:19 -0500)]
scheduler: Use queue for allocating sched IDs.
It has been observed that on long-running busy systems, a scheduler
context can eventually hit INT_MAX for its assigned IDs and end up
overflowing into a very low negative number. When this occurs, this can
result in odd behaviors, because a negative return is interpreted by
callers as being a failure. However, the item actually was successfully
scheduled. The result may be that a freed item remains in the scheduler,
resulting in a crash at some point in the future.
The scheduler can overflow because every time that an item is added to
the scheduler, a counter is bumped and that counter's current value is
assigned as the new item's ID.
This patch introduces a new method for assigning scheduler IDs. Instead
of assigning from a counter, a queue of available IDs is maintained.
When assigning a new ID, an ID is pulled from the queue. When a
scheduler item is released, its ID is pushed back onto the queue. This
way, IDs may be reused when they become available, and the growth of ID
numbers is directly related to concurrent activity within a scheduler
context rather than the uptime of the system.
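A toy version of the ID queue described above is sketched here. The structure, the fixed POOL_MAX capacity and the function names are assumptions for illustration only; the real scheduler code differs.

```c
#include <assert.h>

#define POOL_MAX 8   /* toy capacity; an assumption for this sketch */

struct id_pool {
    int queue[POOL_MAX];  /* ring buffer of released IDs */
    int head;             /* index of the next released ID to reuse */
    int count;            /* number of released IDs waiting in the queue */
    int next_new;         /* next never-used ID */
};

void pool_init(struct id_pool *p)
{
    p->head = 0;
    p->count = 0;
    p->next_new = 0;
}

/* Take an ID: reuse the oldest released one, otherwise mint a new one.
 * Returns -1 when the toy pool is exhausted. */
int pool_get(struct id_pool *p)
{
    int id;

    if (p->count > 0) {
        id = p->queue[p->head];
        p->head = (p->head + 1) % POOL_MAX;
        p->count--;
        return id;
    }
    if (p->next_new >= POOL_MAX) {
        return -1;
    }
    return p->next_new++;
}

/* Release an ID back to the tail of the queue. The caller must only
 * release IDs it obtained from pool_get(), and only once each. */
void pool_put(struct id_pool *p, int id)
{
    p->queue[(p->head + p->count) % POOL_MAX] = id;
    p->count++;
}
```

With this scheme the largest ID ever handed out tracks the peak number of concurrently scheduled items, not the total number of items scheduled over the lifetime of the process.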
Change the validation in the module reload because it now uses the CLI
function for reload. The sip_reload() function never fails and always
returns NULL, so reload() now calls sip_reload() and returns
AST_MODULE_LOAD_SUCCESS.
This problem was detected when reloading through the PUT method in ARI,
which always returned a 404 HTTP code when the module was reloaded.
Mark Michelson [Thu, 10 Sep 2015 14:49:45 +0000 (09:49 -0500)]
res_pjsip: Copy default_from_user to avoid crash.
The default_from_user retrieval function was pulling the
default_from_user from the global configuration struct in an unsafe way.
If using a database as a backend configuration store, the global
configuration struct is short-lived, so grabbing a pointer from it
results in referencing freed memory.
The fix here is to copy the default_from_user value out of the global
configuration struct.
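The copy-out approach can be sketched as below. struct toy_global_config and get_default_from_user() are hypothetical stand-ins for the real res_pjsip types and accessor; the point is only that the caller ends up with its own copy rather than a pointer into the short-lived struct.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the short-lived global configuration object. */
struct toy_global_config {
    char default_from_user[80];
};

/* Copy the value into a caller-owned buffer instead of returning a
 * pointer into cfg, which may be freed soon after this call. */
void get_default_from_user(const struct toy_global_config *cfg,
                           char *buf, size_t size)
{
    snprintf(buf, size, "%s", cfg->default_from_user);
}
```

The copied string stays valid after the configuration object is released, which is the property the fix relies on.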
Thanks go to John Hardin for discovering this problem and proposing the
patch on which this fix is based.
Matt Jordan [Thu, 10 Sep 2015 13:39:21 +0000 (08:39 -0500)]
res/res_pjsip_nat: Ignore REGISTER requests when looking for a Record-Route
We will only rewrite the Contact header if there is no Record-Route header in
the received request. If a malfunctioning proxy places a Record-Route header
into a REGISTER request, we will decide that we shouldn't update the IP/port
in the Contact header, and we will end up storing a contact with an AoR that
contains the NAT'd IP address.
While it is nice to have the proxy *not* send a Record-Route in a REGISTER
request, it's also a good idea to not process the header in a non-dialog
message. This patch updates the code to explicitly ignore the Record-Route
header in REGISTER requests.
Matt Jordan [Fri, 4 Sep 2015 02:15:13 +0000 (21:15 -0500)]
main/config_options: Check for existence of internal object before derefing
Asterisk can load and register an object type while still having an invalid
sorcery mapping. This can cause an issue when a creation call is invoked.
For example, mis-configuring PJSIP's endpoint identifier by IP address mapping
in sorcery.conf will cause the sorcery mechanism to be invalidated; however, a
subsequent ARI invocation to create the object will cause a crash, as the
internal type may not be registered as sorcery expects.
Merely checking for a NULL pointer here solves the issue.
Jonathan Rose [Thu, 3 Sep 2015 19:07:35 +0000 (14:07 -0500)]
ParkAndAnnounce: Add variable inheritance
In Asterisk 11, the announcer channel would receive channel variables
from the channel being parked by means of normal channel inheritance.
This functionality was lost during the big res_parking project in
Asterisk 12. This patch restores that functionality.
David M. Lee [Fri, 4 Sep 2015 21:33:39 +0000 (16:33 -0500)]
res_rtp_asterisk: Add more ICE debugging
In working through a recent ICE negotiation bug, I found the debug
logging in res_rtp_asterisk to be lacking. This patch adds a number of
debug and warning statements that were helpful.
res_pjsip: Use hash for contact object identity instead of Contact URI.
In the wild it is possible for Contact URIs to be quite long as
parameters can exist on them. This can present a problem when storing
them in the AstDB as the URI is used as part of the object name and
there is a fixed length limit for the AstDB. This will cause
the contact to not get stored.
This change uses the MD5 hash of the Contact URI as part of the
object name instead. This has a fixed length which is guaranteed
to not exceed the AstDB length limit.
Matt Jordan [Mon, 7 Sep 2015 16:15:59 +0000 (11:15 -0500)]
res/res_pjsip: Purge contacts when an AoR is deleted
When an AoR is deleted by an external mechanism, such as through ARI, we
currently do not remove dynamic contacts that were created for that AoR as a
result of a received REGISTER request. As a result, re-creating the AoR will
cause the dynamic contact to be interpreted as a persistent contact, leading
to some rather strange state being created for the contacts/endpoints.
This patch adds a sorcery observer for the 'aor' object. When a delete is
issued on the underlying sorcery object, the observer is called, and all
contacts created and persisted in sorcery for that AoR are also removed. Note
that we don't want to perform this action when an AO2 object that is an AoR is
destroyed, as the AoR can still exist in the backing storage (and we would
thus be removing valid contacts from an AoR that still "exists".)