George Joseph [Wed, 30 Mar 2016 23:34:42 +0000 (17:34 -0600)]
pjproject_bundled: Fix use of LDCONFIG for shared library link creation
LDCONFIG apparently isn't set to something sane on all systems so the creation
of the shared library links fails. Instead of just testing for non-blank,
main/Makefile now checks that LDCONFIG is actually executable and reverts to
LN if it isn't.
This applies to both libasteriskpj and libasteriskssl.
Thanks to 'abelbeck' for pointing out that the issue was LDCONFIG.
ASTERISK-25873 #close Reported-by: Hans van Eijsden
Change-Id: I25b76379bc637726ec044b2c0e709b56b3701729
Richard Mudgett [Tue, 29 Mar 2016 18:47:08 +0000 (13:47 -0500)]
res_stasis: Add control ref to playback and recording structs.
The stasis_app_playback and stasis_app_recording structs need to have a
struct stasis_app_control ref. Other threads can get a reference to the
playback and recording structs from their respective global container.
These other threads can then use the control pointer they contain after
the control struct has gone.
* Add control ref to stasis_app_playback and stasis_app_recording structs.
With the refs added, the control command queue can now have a circular
control reference which will cause the control struct to never get
released if the control's command queue is not flushed when the channel
leaves the Stasis application. Also the command queue needs better
protection from adding commands if the control->is_done flag is set.
Richard Mudgett [Mon, 28 Mar 2016 23:10:40 +0000 (18:10 -0500)]
res_stasis: Fix crash on a hanging up channel.
* Give the struct stasis_app_control ao2 object a ref to the channel held
in the object. Now the channel will still be around if a thread needs to
post a stasis message instead of crashing because the topic was destroyed.
* Moved stopping any lingering silence generator out of the struct
stasis_app_control destructor and made it a part of exiting the Stasis
application. Who knows which thread the destructor will be called under
so it cannot affect the channel's silence generator. Not only was the
channel unprotected when the silence generator was stopped, stasis may no
longer even control the channel.
Richard Mudgett [Tue, 29 Mar 2016 23:06:24 +0000 (18:06 -0500)]
res_ari: Cannot get control also means channel is unavailable.
The only caller of ari_bridges_play_found() has this note:
If ari_bridges_play_found fails because the channel is unavailable for
playback, The channel will be removed from the playback list soon. We can
keep trying to get channels from the list until we either get one that
will work or else there isn't a channel for this bridge anymore, in which
case we'll revert to ari_bridges_play_new.
George Joseph [Wed, 30 Mar 2016 14:46:32 +0000 (08:46 -0600)]
res_rtp_asterisk: Fix placement of txcount increment
Commit 1bce690ccb36a4744a327c07af23a9a3a0fa20cd was incrementing txcount
for rtcp packets as well as rtp packets and that was causing sender reports
to be generated instead of receiver reports in cases where no rtp was actually
being sent.
Moved the txcount increment from __rtp_sento, which handles both rtp and rtcp,
to rtp_sento which only handles rtp packets.
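A minimal sketch of the placement issue follows. The struct and function names (rtp_stats, low_level_send, send_rtp, send_rtcp) are hypothetical stand-ins and not the res_rtp_asterisk code; only the idea of counting in the RTP-only wrapper rather than in the shared send path comes from the commit. Counting only successful sends is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-instance counters. */
struct rtp_stats {
    unsigned int txcount;   /* RTP packets sent */
    unsigned int rtcp_sent; /* RTCP packets sent (illustration only) */
};

/* Shared path used by both RTP and RTCP; it must not touch txcount. */
static int low_level_send(const void *buf, size_t len)
{
    (void) buf;
    return (int) len; /* pretend the whole packet was written */
}

/* RTP-only wrapper: the only place txcount is incremented. */
static int send_rtp(struct rtp_stats *stats, const void *buf, size_t len)
{
    int res = low_level_send(buf, len);

    if (res > 0) {
        stats->txcount++;
    }
    return res;
}

/* RTCP wrapper: uses the same shared path but leaves txcount alone. */
static int send_rtcp(struct rtp_stats *stats, const void *buf, size_t len)
{
    int res = low_level_send(buf, len);

    if (res > 0) {
        stats->rtcp_sent++;
    }
    return res;
}
```

With this layout, sending only RTCP leaves txcount at zero, which is the condition under which a receiver report rather than a sender report is expected.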
George Joseph [Sun, 27 Mar 2016 03:33:14 +0000 (21:33 -0600)]
chan_pjsip: Add 'pjsip show channelstats'
Added the ability to show channel statistics to chan_pjsip (cli_functions.c).
Moved the existing 'pjsip show channel(s)' functionality from
pjsip_configuration to cli_functions.c. The stats needed chan_pjsip's
private header so it made sense to move the existing channel commands as well.
Now using stasis_cache_dump to get the channel snapshots rather than retrieving
all endpoints, then getting each one's channel snapshots. Much more efficient.
Jacek Konieczny [Fri, 25 Mar 2016 15:42:12 +0000 (16:42 +0100)]
app_echo: forward and generate VIDUPDATE frames
When using app_echo via WebRTC with VP8 video the video would appear
only after a few minutes, because there would be nothing to request
a full reference frame.
This fixes the problem in two ways:
- echoes any VIDUPDATE frames received on the channel
- sends one such frame when the first video frame is to be forwarded
This makes the echo work with the Firefox and Chrome WebRTC implementations.
George Joseph [Sun, 27 Mar 2016 17:53:16 +0000 (11:53 -0600)]
res_rtp_asterisk: Fix packet stats on bridged connection
rxcount, txcount, rxoctetcount and txoctetcount weren't being calculated
for bridged streams because the calculations were being done after the
bridged short-circuit. Actually, rxoctetcount wasn't ever being calculated.
Moved the calculations so they occur for all valid received packets and
all transmitted packets. Also added rxoctetcount and txoctetcount to
ast_rtp_instance_stat.
George Joseph [Fri, 11 Mar 2016 01:52:14 +0000 (18:52 -0700)]
res_pjsip/pjsip_options: Fix From generation on outgoing OPTIONS
No one seemed to notice but every time an OPTIONS goes out, it goes
out with a From of "asterisk" (or whatever the default from_user is set to),
even if you specify an endpoint.
The issue had several causes...
qualify_contact is only called with an endpoint if called from the CLI.
If the endpoint is NULL, qualify_contact only looks up the endpoint if
authenticate_qualify=yes. Even then, it never passes it on to
ast_sip_create_request where the From header is set. Therefore From
is always "asterisk" (or whatever the default from_user is set to).
Even if ast_sip_create_request were to get an endpoint, it only sets
the From if endpoint->from_user is set.
The fix is 4 parts...
First, create_out_of_dialog_request was modified to use the endpoint id
if endpoint was specified and from_user is not set.
Second, qualify_contact was modified to always look up an endpoint if
one wasn't specified regardless of authenticate_qualify. It then passes
the endpoint on to create_out_of_dialog_request.
Third (and most importantly), find_an_endpoint was modified to find
an endpoint by using an "aors LIKE %contact->aor%" predicate with
ast_sorcery_retrieve_by_fields. As such, this patch will only work
if the sorcery realtime optimizations patch goes in. Otherwise we'd
be pulling the entire endpoints database every time we send an OPTIONS.
Since we already know the contact's aor, the on_endpoint callback was also
modified to just check if the contact->aor is an exact match to one of
the endpoint's.
Finally, since we now have an endpoint for every OPTIONS request,
res_pjsip/endpt_send_request (which handles out-of-dialog requests) was
updated to get the transport from the endpoint and set it on tdata.
Now the correct transport is used.
George Joseph [Tue, 8 Mar 2016 21:55:30 +0000 (14:55 -0700)]
sorcery/res_pjsip: Refactor for realtime performance
There were a number of places in the res_pjsip stack that were getting
all endpoints or all aors, and then filtering them locally.
A good example is pjsip_options which, on startup, retrieves all
endpoints, then the aors for those endpoints, then tests the aors to see
if the qualify_frequency is > 0. One issue was that it never did
anything with the endpoints other than retrieve the aors so we probably
could have skipped a step and just retrieved all aors. But nevermind.
This worked reasonably well with local config files but with a realtime
backend and thousands of objects, this was a nightmare. The issue
really boiled down to the fact that while realtime supports predicates
that are passed to the database engine, the non-realtime sorcery
backends didn't.
They do now.
The realtime engines have a scheme for doing simple comparisons. They
take in an ast_variable (or list) for matching, and the name of each
variable can contain an operator. For instance, a name of
"qualify_frequency >" and a value of "0" would create a SQL predicate
that looks like "where qualify_frequency > '0'". If there's no operator
after the name, the engines add an '=' so a simple name of
"qualify_frequency" and a value of "10" would return exact matches.
The non-realtime backends decide whether to include an object in a
result set by calling ast_sorcery_changeset_create on every object in
the internal container. However, ast_sorcery_changeset_create only does
exact string matches, so a name of "qualify_frequency >" and a
value of "0" returns nothing because the literal "qualify_frequency >"
doesn't match any name in the objset set.
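The name-plus-operator convention used by the realtime engines, which an exact name match cannot honour, can be sketched roughly as below. The helpers split_field and build_predicate are hypothetical and written only for illustration; real engines also escape values and build their SQL differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split a field such as "qualify_frequency >" into a name and an operator.
 * When no operator follows the name, "=" is assumed.
 * name_len and op_len must be greater than zero.
 */
static void split_field(const char *field, char *name, size_t name_len,
                        char *op, size_t op_len)
{
    const char *space = strchr(field, ' ');
    size_t n = space ? (size_t) (space - field) : strlen(field);

    if (n >= name_len) {
        n = name_len - 1;
    }
    memcpy(name, field, n);
    name[n] = '\0';

    /* Skip the blanks between the name and the operator. */
    while (space && *space == ' ') {
        space++;
    }
    snprintf(op, op_len, "%s", (space && *space) ? space : "=");
}

/* Build predicate text in the shape quoted in the commit message. */
static void build_predicate(const char *field, const char *value,
                            char *out, size_t out_len)
{
    char name[64];
    char op[16];

    split_field(field, name, sizeof(name), op, sizeof(op));
    snprintf(out, out_len, "where %s %s '%s'", name, op, value);
}
```

An exact string comparison of "qualify_frequency >" against the field name "qualify_frequency" fails, which is why the operator has to be separated from the name before any matching is attempted.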
So, the real task was to create a generic string matcher that can take a
left value, operator and a right value and perform the match. To that
end, strings.c has a new ast_strings_match(left, operator, right)
function. Left and right are the strings to operate on and the operator
can be a string containing any of the following: = (or NULL or ""), !=,
>, >=, <, <=, like or regex. If the operator is like or regex, the
right string should be a %-pattern or a regex expression. If both left
and right can be converted to float, then a numeric comparison is
performed, otherwise a string comparison is performed.
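A minimal sketch of such a matcher, assuming only the behaviour described above: simple_match and to_double are hypothetical names, not the ast_strings_match implementation, and the like and regex operators are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Try to read the whole string as a number; returns 1 on success. */
static int to_double(const char *s, double *out)
{
    char *end;

    if (!s || !*s) {
        return 0;
    }
    *out = strtod(s, &end);
    return end != s && *end == '\0';
}

/*
 * Compare left and right using op ("=", "!=", ">", ">=", "<", "<=").
 * A NULL or empty op means "=". Returns 1 on match, 0 otherwise.
 */
static int simple_match(const char *left, const char *op, const char *right)
{
    double l = 0.0;
    double r = 0.0;
    int cmp;

    if (!left || !right) {
        return 0;
    }
    if (!op || !*op) {
        op = "=";
    }

    if (to_double(left, &l) && to_double(right, &r)) {
        cmp = (l > r) - (l < r);   /* numeric comparison */
    } else {
        cmp = strcmp(left, right); /* string comparison */
    }

    if (strcmp(op, "=") == 0)  { return cmp == 0; }
    if (strcmp(op, "!=") == 0) { return cmp != 0; }
    if (strcmp(op, ">") == 0)  { return cmp > 0; }
    if (strcmp(op, ">=") == 0) { return cmp >= 0; }
    if (strcmp(op, "<") == 0)  { return cmp < 0; }
    if (strcmp(op, "<=") == 0) { return cmp <= 0; }
    return 0; /* operator not handled by this sketch */
}
```

The numeric path matters for a case like simple_match("10", ">", "9"): compared as strings "10" sorts before "9", but compared as numbers the match succeeds.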
To use this new function on ast_variables, 2 new functions were added to
config.c. One that compares 2 ast_variables, and one that compares 2
ast_variable lists. The former is useful when you want to compare 2
ast_variables that happen to be in a list but don't want to traverse the
list. The latter will traverse the right list and return true if all
the variables in it match the left list.
Now, the backends' fields_cmp functions call ast_variable_lists_match
instead of ast_sorcery_changeset_create and they can now process the
same syntax as the realtime engines. The realtime backend just passes
the variable list unaltered to the engine. The only gotcha is that
there's no common realtime engine support for regex so that's been noted
in the api docs for ast_sorcery_retrieve_by_fields.
Only one more change to sorcery was done... A new config flag
"allow_unqualified_fetch" was added to res_sorcery_realtime.
"no": ignore fetches if no predicate fields were supplied.
"error": same as no but emit an error. (good for testing)
"yes": allow (the default).
"warn": allow but emit a warning. (good for testing)
Now on to res_pjsip...
pjsip_options was modified to retrieve aors with qualify_frequency > 0
rather than all endpoints then all aors. Not only was this a big
improvement in realtime retrieval but even for config files there's an
improvement because we're not going through endpoints anymore.
res_pjsip_mwi was modified to retrieve only endpoints with something in
the mailboxes field instead of all endpoints then testing mailboxes.
res_pjsip_registrar_expire was completely refactored. It was retrieving
all contacts then setting up scheduler entries to check for expiration.
Now, it's a single thread (like keepalive) that periodically retrieves
only contacts whose expiration time is < now and deletes them. A new
contact_expiration_check_interval was added to global with a default of
30 seconds.
Ross Beer reports that with this patch, his Asterisk startup time dropped
from around an hour to under 30 seconds.
There are still objects that can't be filtered at the database like
identifies, transports, and registrations. These are not going to be
anywhere near as numerous as endpoints, aors, auths, contacts however.
Back to allow_unqualified_fetch. If this is set to yes and you have a
very large number of objects in the database, the pjsip CLI commands
will attempt to retrieve ALL of them if not qualified with a LIKE.
Worse, if you type "pjsip show endpoint <tab>" guess what's going to
happen? :) Having a cache helps but all the objects will have to be
retrieved at least once to fill the cache. Setting
allow_unqualified_fetch=no prevents the mass retrieve and should be used
on endpoints, auths, aors, and contacts. It should NOT be used for
identifies, registrations and transports since these MUST be
retrieved in bulk.
Richard Mudgett [Sat, 26 Mar 2016 04:19:22 +0000 (23:19 -0500)]
res_parking: Fix blind transfer dynamic lots creation.
Blind transfers to a recognized parking extension need to use the parker's
channel variable values to create the dynamic parking lot. This is
because there is always only one parker while the parkee may actually be a
multi-party bridge. A multi-party bridge can never supply the needed
channel variables to create the dynamic parking lot. In the multi-party
bridge blind transfer scenario, the parker's CHANNEL(parkinglot) value and
channel variables are inherited by the local channel used to park the
bridge.
* In park_common_setup(), make use of the parker instead of the parkee to
supply the dynamic parking lot channel variable values. In all but one
case, the parkee is the same as the parker. However, in the recognized
parking extension blind transfer scenario for a two party bridge they are
different channels. For consistency, we need to use the parker channel.
* In park_local_transfer(), pass the CHANNEL(parkinglot) value to the
local channel when blind transferring a multi-party bridge to a recognized
parking extension.
* When a local channel starts a call, the Local;2 side needs to inherit
the CHANNEL(parkinglot) value from Local;1.
The DTMF one-touch parking case wasn't even trying to create dynamic
parking lots before it aborted the attempt.
* In parking_park_call(), add missing code to create a dynamic parking
lot.
A DTMF bridge hook is documented as returning -1 to remove the hook,
though the hook caller is really coded to accept any non-zero value. See the
ast_bridge_hook_callback typedef.
* In feature_park_call(), don't remove the DTMF one-touch parking hook
because of an error.
ASTERISK-24605 #close
Reported by: Philip Correia
Patches:
call_park.patch (license #6672) patch uploaded by Philip Correia
Richard Mudgett [Fri, 18 Mar 2016 19:01:02 +0000 (14:01 -0500)]
res_parking: Misc fixes.
res/parking/parking_applications.c:
* Add malloc fail checks in setup_park_common_datastore().
* Fix playing parking failed announcement to only happen on non-blind
transfers in park_app_exec(). It could never go out before because a test
was provably always false.
res/parking/parking_bridge.c:
* Fix NULL tolerance in generate_parked_user() because
bridge_parking_push() can theoretically pass a NULL parker channel if the
parker channel went away for some reason.
* Clarify some weird code dealing with blind_transfer in
bridge_parking_push().
res/parking/parking_bridge_features.c:
* Made park_local_transfer() set BLINDTRANSFER on the Local;1 channel
which will be bulk copied to the Local;2 channel on the subsequent
ast_call(). The additional advantage is if the parker channel has the
BLINDTRANSFER and ATTENDEDTRANSFER variables set they are now guaranteed
to be overridden.
res/parking/parking_manager.c:
* Fix AMI Park action input range checking of the Timeout header in
manager_park().
* Reduced locking scope to where needed in manager_park().
res/res_parking.c:
* Fix some off-nominal missing unlocks by eliminating the returns.
Joshua Colp [Fri, 25 Mar 2016 11:02:41 +0000 (08:02 -0300)]
media_cache: Demote warning to debug as it may occur often.
The file playback system will now query the media cache and then
the old file functionality. Under normal conditions this will result
in the cache failing to retrieve a file causing a warning message
to get output each time a file is played back.
This change demotes this warning to a debug message.
Mark Michelson [Thu, 10 Mar 2016 22:58:49 +0000 (16:58 -0600)]
Restrict CLI/AMI commands on shutdown.
During stress testing, we have frequently seen crashes occur because a
CLI or AMI command attempts to access information that is in the process
of being destroyed.
When addressing how to fix this issue, we initially considered fixing
individual crashes we observed. However, the changes required to fix
those problems would introduce considerable overhead to the nominal
case. This is not reasonable in order to prevent a crash from occurring
while Asterisk is already shutting down.
Instead, this change makes it so AMI and CLI commands cannot be executed
if Asterisk is being shut down. For AMI, this is absolute. For CLI,
though, certain commands can be registered so that they may be run
during Asterisk shutdown.
Gianluca Merlo [Sat, 19 Mar 2016 12:34:26 +0000 (13:34 +0100)]
config: fix flags in uint option handler
The configuration unsigned integer option handler sets flags for the
parser as if the option should be a signed integer (PARSE_INT32),
leading to errors on "out of range" values. Fix flags (PARSE_UINT32).
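The effect of parsing an unsigned option with signed range limits can be shown with a small sketch. The helpers parse_int32 and parse_uint32 are hypothetical and are not the Asterisk parser; they only mirror the two range checks.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse as a signed 32-bit value; returns 0 on success, -1 on error. */
static int parse_int32(const char *s, int32_t *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (errno || end == s || *end != '\0' || v < INT32_MIN || v > INT32_MAX) {
        return -1;
    }
    *out = (int32_t) v;
    return 0;
}

/* Parse as an unsigned 32-bit value; returns 0 on success, -1 on error. */
static int parse_uint32(const char *s, uint32_t *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(s, &end, 10);
    if (errno || end == s || *end != '\0' || v < 0 || v > UINT32_MAX) {
        return -1;
    }
    *out = (uint32_t) v;
    return 0;
}
```

A value such as "4000000000" is a valid unsigned 32-bit integer but lies above INT32_MAX, so the signed check rejects it as out of range while the unsigned check accepts it.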
A fix to res_pjsip is also present which stops invalid flags from
being passed when registering sorcery object fields for qualify
status.
Walter Doekes [Thu, 24 Mar 2016 12:51:00 +0000 (13:51 +0100)]
musiconhold: Only warn if music class is not found in memory and database.
The log message when a MusicOnHold music class was not found was changed
from debug level to WARNING level in Asterisk 11.19 and 13.5. For those
using realtime musiconhold, this message is wrong because it warns
before checking the database.
This changeset delays the warning until after the database has been
checked.
Walter Doekes [Thu, 24 Mar 2016 10:48:32 +0000 (11:48 +0100)]
core/logging: Fix broken syslog levels on older glibc.
The fix to ASTERISK-25407 introduced the usage of LOG_MAKEPRI. However
this macro is broken in older glibc (< 2.17); it would left-shift the
facility a second time, causing the resultant priority to become
invalid.
The syslog manpage mentions nothing about LOG_MAKEPRI and suggests this:
The priority argument is formed by ORing the facility and the level
values [...].
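The double shift can be illustrated with a short sketch. The two helper functions are hypothetical and written for this example; the sketch assumes, as is the case on glibc, that facility constants such as LOG_LOCAL0 are already shifted into position.

```c
#include <syslog.h>

/* What the older macro effectively did: shift an already-shifted facility. */
static int pri_double_shift(int facility, int level)
{
    return (facility << 3) | level;
}

/* What the syslog manpage suggests: OR the facility and the level. */
static int pri_or(int facility, int level)
{
    return facility | level;
}

/* Example: the priority for a warning on the local0 facility. */
static int local0_warning(void)
{
    return pri_or(LOG_LOCAL0, LOG_WARNING);
}
```

Because the facility occupies the bits above the three level bits, shifting it a second time moves it out of the range syslog expects, which is how the resulting priority becomes invalid.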
Matt Jordan [Mon, 29 Feb 2016 01:05:16 +0000 (19:05 -0600)]
main/app: Only look to end of file if ':end' is specified, and not just ':'
There is a little known feature in app_controlplayback that will cause the
specified offset to be used relative to the end of a file if a ':end' is
detected within the filename.
This feature is pretty bad, but okay.
However, a bug exists in this code where a ':' detected in the filename
will cause the end pointer to be non-NULL, even if the full ':end' isn't
specified. This causes us to treat an unspecified offset (0) as being
"start playing from the end of the file", resulting in no file playback
occurring.
This patch fixes this bug by resetting the end pointer if ':end' is not
found in the filename.
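The corrected check can be sketched as follows. The function offset_from_end is a hypothetical helper, not the main/app code, and it compares case-sensitively as a simplification.

```c
#include <string.h>

/*
 * Returns 1 only when the filename carries a full ":end" suffix, meaning
 * the offset should be applied relative to the end of the file.
 */
static int offset_from_end(const char *filename)
{
    const char *end = strchr(filename, ':');

    /* A bare ':' is not enough; reset the pointer unless it is ":end". */
    if (end && strcmp(end, ":end") != 0) {
        end = NULL;
    }
    return end != NULL;
}
```

A filename that merely contains a ':' (a URI, for instance) no longer leaves the end pointer set, so an unspecified offset of 0 is treated as the start of the file.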
Matt Jordan [Sat, 26 Dec 2015 21:29:04 +0000 (15:29 -0600)]
main/file: Add the ability to play media in the media cache
This patch allows applications/APIs that access media through the core file
APIs to play media in the media cache. Prior to determining if a 'filename'
exists, the filename is passed to the media cache's retrieve API call. If
that call succeeds, the local file passed back by the API is opened for
streaming. When used in this fashion, the 'filename' is actually a URI that
the media cache can process and understand.
Matt Jordan [Wed, 30 Dec 2015 16:52:28 +0000 (10:52 -0600)]
tests/test_http_media_cache: Add unit tests for res_http_media_cache
This patch adds unit tests for res_http_media_cache that cover nominal
creation and retrieval and, through them, staleness and deletion checks.
In addition, this patch adds tests that cover the interaction of various
HTTP headers, including Expires, ETag, and Cache-Control.
Matthew Jordan [Thu, 29 Jan 2015 14:38:23 +0000 (14:38 +0000)]
res/res_http_media_cache: Add an HTTP(S) backend for the core media cache
This patch adds a bucket backend for the core media cache that interfaces to a
remote HTTP server. When a media item is requested in the cache, the cache will
query its bucket backends to see if they can provide the media item. If that
media item has a scheme of HTTP or HTTPS, this backend will be invoked.
The backend provides callbacks for the following:
* create - this will always retrieve the URI specified by the provided
bucket_file, and store it in the file specified by the object.
* retrieve - this will pull the URI specified and store it in a temporary
file. It is then up to the media cache to move/rename this file
if desired.
* delete - destroys the file associated with the bucket_file.
* stale - if the bucket_file has expired, based on received HTTP headers from
the remote server, or if the ETag on the server no longer matches
the ETag stored on the bucket_file, the resource is determined to be
stale.
Note that the backend respects the ETag, Expires, and Cache-Control headers
provided by the HTTP server it is querying.
Matt Jordan [Sat, 26 Dec 2015 21:31:26 +0000 (15:31 -0600)]
main/media_cache: Provide an extension on the local file associated with a URI
This patch does the following:
First, it addresses file extension handling in the media cache. The media core
in Asterisk is a bit interesting in that it wants:
* A file to have an extension on it. That extension is used to associate the
file with a defined format module.
* The filename passed to the core to not have an extension on it. This allows
the core to match the available file formats with the format a channel
is capable of handling.
Unfortunately, this makes the current implementation a bit lacking in the media
cache. By default, we do not store the extension of a retrieved URI on the
local file that is created. As a result, the media core does not know what
format the file is, and the file is ignored. Modifying the file outside of the
media core is bad, as we would not be able to update the internal
ast_bucket_file's path.
At the same time, we do not want to pass the extension out in the file_path
parameter in ast_media_cache_retrieve. This parameter is intended to be fed
into the media core; if we passed the extension, all callers would have to
strip it off.
Thus, this patch does the following:
* If there is an extension specified in the URL, we append it to the local
file name (if a preferred file name isn't specified), and we store that
in the local file path.
* The extension, however, is stripped off of the file_path parameter passed
back out of ast_media_cache_retrieve.
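The two steps above can be sketched roughly as below. All three helpers are hypothetical, the paths are made up for illustration, and the sketch ignores query strings and URIs without a path component.

```c
#include <stdio.h>
#include <string.h>

/* Return the extension (including the dot) of the last URI segment, or NULL. */
static const char *uri_extension(const char *uri)
{
    const char *slash = strrchr(uri, '/');
    const char *dot = strrchr(uri, '.');

    return (dot && (!slash || dot > slash)) ? dot : NULL;
}

/* Local path stored on disk: base name plus the URI's extension, if any. */
static void local_path_with_ext(const char *base, const char *uri,
                                char *out, size_t out_len)
{
    const char *ext = uri_extension(uri);

    snprintf(out, out_len, "%s%s", base, ext ? ext : "");
}

/* Path handed back to callers: the same path with the extension removed. */
static void path_without_ext(const char *local, char *out, size_t out_len)
{
    const char *ext = uri_extension(local);
    size_t n = ext ? (size_t) (ext - local) : strlen(local);

    if (n >= out_len) {
        n = out_len - 1;
    }
    memcpy(out, local, n);
    out[n] = '\0';
}
```

The file on disk keeps its extension so a format module can be associated with it, while callers receive the extension-free path that the media core expects.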
Second, this patch causes stale items to be completely removed from the system.
Prior to this patch, sound files could be orphaned due to the bucket
referencing the file being deleted, but the file itself not being removed. This
is now addressed by explicitly calling ast_bucket_file_delete on the
bucket_file when it is deemed to be stale. Note that this only happens when we
know we will attempt to retrieve the resource again.
Finally, this patch changes the AO2 container holding media items to just use
a regular mutex. The usage for this container already assumed it was a plain
mutex, and - given that retrieval of an item can cause it to be replaced in
the container - a mutex makes more sense than a read/write lock.
Matthew Jordan [Sun, 26 Oct 2014 01:21:18 +0000 (01:21 +0000)]
funcs/func_curl: Add the ability for CURL to download and store files
This patch adds a write option to the CURL dialplan function, allowing it to
CURL files and store them locally. The value 'written' to the CURL URL
specifies the location on disk to store the file. As an example:
same => n,Set(CURL(http://1.1.1.1/foo.wav)=/tmp/foo.wav)
Would retrieve the file foo.wav from the remote server and store it in the
/tmp directory.
Due to the potentially dangerous nature of this function call, APIs are
forbidden from using the write functionality unless live_dangerously is set
to True in asterisk.conf.
chan_sip.c: Space after port causes unnecessary resolution attempt
check_via() already skips leading blanks where the sent-by address (with the
optional port) should be placed.
Since RFC 3261 allows for blanks between the port and the Via parameters:
> https://tools.ietf.org/html/rfc3261#section-20.42
(actually it allows many more blanks ;-)), I just switched from
ast_skip_blanks() to ast_strip() on the local copy of the string.
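The difference between skipping leading blanks and stripping both ends can be sketched with a hypothetical helper. strip_blanks below only mimics the trimming behaviour the commit relies on; it is not the Asterisk function.

```c
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing blanks in place; returns the trimmed start. */
static char *strip_blanks(char *s)
{
    char *end;

    while (*s && isspace((unsigned char) *s)) {
        s++;
    }
    end = s + strlen(s);
    while (end > s && isspace((unsigned char) end[-1])) {
        end--;
    }
    *end = '\0';
    return s;
}
```

With only the leading blanks skipped, a sent-by value such as "192.0.2.10:5060 " keeps its trailing space, so it no longer looks like a plain address and port.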
Gianluca Merlo [Sat, 19 Mar 2016 01:32:51 +0000 (02:32 +0100)]
func_aes: fix misuse of strlen on binary data
The encryption code for AES_ENCRYPT evaluates the length of the data to
be encoded in base64 using strlen. The data is binary, thus the length
of it can be underestimated at the first NULL character.
Reuse the write pointer offset to evaluate it, instead.
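A minimal sketch of why strlen is the wrong tool here: the helpers and the sample bytes are hypothetical and no real encryption is performed. Fixed-size blocks are copied into an output buffer, and the length is then measured both ways.

```c
#include <string.h>

/* Copy one 16-byte block of pretend ciphertext and advance the write pointer. */
static unsigned char *write_block(unsigned char *dst, const unsigned char block[16])
{
    memcpy(dst, block, 16);
    return dst + 16;
}

/* Length as strlen() sees it: stops at the first zero byte. */
static size_t length_by_strlen(const unsigned char *buf)
{
    return strlen((const char *) buf);
}

/* Length derived from how far the write pointer advanced. */
static size_t length_by_offset(const unsigned char *start,
                               const unsigned char *write_ptr)
{
    return (size_t) (write_ptr - start);
}
```

If a block happens to contain a zero byte early on, length_by_strlen reports only the bytes before it, while the write pointer offset always reflects everything that was written.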
Kevin Harwell [Wed, 16 Mar 2016 17:37:01 +0000 (12:37 -0500)]
chan_pjsip: transfers with direct media reinvite has wrong address/port
During a transfer involving direct media a race occurs between when the
transferer channel is swapped out, initiating rtp changes/updates, and the
subsequent reinvites.
When Alice, after speaking with Charlie (Bob is on hold), connects Bob and
Charlie, invites are sent to each in order to establish the call between them.
Bob is taken off hold and Charlie is told to have his media flow through
Asterisk. However, if before those invites go out the bridge updates Bob's
and/or Charlie's rtp information with direct media data (i.e. address, port)
then the invite(s) will contain the remote data in the SDP instead of the
Asterisk data.
The race occurs in the native bridge glue code when updating the peer. The
direct_media_address can get set twice before sending out the first invite
during call connection. This can happen because the checking/setting of the
direct_media_address happened in one thread while the sending of the invite(s)
happened in another thread.
This fix removes the race condition by moving the checking/setting of the
direct_media_address to be in the same thread as the sending of the invite(s).
This serializes the checking/setting and sending so they can no longer happen
out of order.
res_pjsip_refer.c: Fix seg fault in process of Refer-to header.
The "Refer-to" header of an incoming REFER request is parsed by
pjsip_parse_uri(). That function requires the URI parameter to be NULL
terminated. Unfortunately, the previous code added the NULL terminator by
overwriting memory that may not be safe to write. The result of that overwrite
could be benign, memory corruption, or a segmentation fault. Now the URI
is NULL terminated safely by copying the URI to a new chunk of memory with
the correct size to be NULL terminated.
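The safe approach can be sketched as below. The counted_str type and dup_terminated helper are hypothetical, and malloc stands in for whatever allocator the real code uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: ptr is NOT guaranteed to be terminated. */
struct counted_str {
    const char *ptr;
    size_t len;
};

/* Return a freshly allocated, terminated copy; the caller frees it. */
static char *dup_terminated(const struct counted_str *s)
{
    char *copy = malloc(s->len + 1);

    if (!copy) {
        return NULL;
    }
    memcpy(copy, s->ptr, s->len);
    copy[s->len] = '\0'; /* the terminator lands in memory we own */
    return copy;
}
```

Writing the terminator at ptr[len] in the original buffer would touch a byte that belongs to something else, whereas the copy has room reserved for it.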
Leif Madsen [Thu, 25 Feb 2016 16:29:05 +0000 (11:29 -0500)]
Add initial support to build Docker images
This work-in-progress is the first step to being able to reliably
build Asterisk containers from the Asterisk source. I'm submitting
this based on feedback gained at AstriDevCon 2015.
Information about how to use this is provided in contrib/docker/README.md
and will result in a local Asterisk container being built right from
your source. I believe this can eventually be automated via
hub.docker.com.
Richard Mudgett [Fri, 11 Mar 2016 18:22:48 +0000 (12:22 -0600)]
chan_sip.c: Fix mwi resub deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
Richard Mudgett [Thu, 10 Mar 2016 23:01:12 +0000 (17:01 -0600)]
chan_sip.c: Fix registration timeout and expire deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
Richard Mudgett [Wed, 9 Mar 2016 22:26:26 +0000 (16:26 -0600)]
chan_sip.c: Fix waitid deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
* Made check_pendings() always run under the scheduler thread so scheduler
ids can be checked safely.
Richard Mudgett [Thu, 10 Mar 2016 18:17:09 +0000 (12:17 -0600)]
chan_sip.c: Fix t38id deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
Richard Mudgett [Wed, 9 Mar 2016 22:34:53 +0000 (16:34 -0600)]
chan_sip.c: Fix reinviteid deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
Richard Mudgett [Mon, 7 Mar 2016 19:21:44 +0000 (13:21 -0600)]
chan_sip.c: Fix autokillid deadlock potential.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
* Fix clearing autokillid in __sip_autodestruct() even though we could
reschedule.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
* Fix retrans_pkt() to call check_pendings() with both the owner channel
and the private objects locked as required.
* Refactor dialog retransmission packet list to safely remove packet
nodes. The list nodes are now ao2 objects. The list has a ref and the
scheduled entry has a ref.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Stopping a scheduled event can result in a deadlock if the scheduled event
is running when you try to stop the event. If you hold a lock needed by
the scheduled event while trying to stop the scheduled event then a
deadlock can happen. The general strategy for resolving the deadlock
potential is to push the actual starting and stopping of the scheduled
events off onto the scheduler/do_monitor() thread by scheduling an
immediate one shot scheduled event. Some restructuring may be needed
because the code may assume that the start/stop of the scheduled events is
immediate.
Richard Mudgett [Fri, 11 Mar 2016 03:54:03 +0000 (21:54 -0600)]
chan_sip.c: Clear scheduled immediate events on unload.
This patch is part of a series to resolve deadlocks in chan_sip.c.
The reordering of chan_sip's shutdown is to handle any immediate events
that get put onto the scheduler so resources aren't leaked. The typical
immediate events at this time are going to be concerned with stopping
other scheduled events.
This patch is part of a series to resolve deadlocks in chan_sip.c.
Delaying destruction of the chan_sip sip_pvt structures caused the
/channels/chan_sip/test_sip_rtpqos unit test to crash. That test
registers a special test ast_rtp_engine with the rtp engine module. When
the unit test completes it cleans up by unregistering the test
ast_rtp_engine and exits. Since the delayed destruction of the sip_pvt
happens after the unit test returns, the destructor tries to call the rtp
engine destroy callback of the test ast_rtp_engine auto variable which no
longer exists on the stack.
* Change the test ast_rtp_engine auto variable to a static variable. Now
the variable can still exist after the unit test exits so the delayed
sip_pvt destruction can complete successfully.
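A minimal sketch of why the storage class matters here. The engine and pvt structs below are hypothetical stand-ins, not the real ast_rtp_engine or sip_pvt; the point is only that a pointer saved by a longer-lived object must refer to storage that outlives the function that set it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the test engine and the private structure. */
struct engine {
	int destroy_calls;
};

struct pvt {
	struct engine *engine;
};

/* Models the delayed destructor: it runs after the unit test has returned
 * and still dereferences the engine pointer saved in the pvt. */
static int pvt_destroy(struct pvt *p)
{
	assert(p != NULL && p->engine != NULL);
	return ++p->engine->destroy_calls;
}

/* Models the unit test body. */
static void run_unit_test(struct pvt *p)
{
	/* static: the object lives for the whole program, so the pointer
	 * stored in *p stays valid after this function returns. With an
	 * automatic variable the pointer would dangle once the stack frame
	 * is gone, and pvt_destroy() would touch dead stack memory. */
	static struct engine test_engine = { 0 };

	assert(p != NULL);
	p->engine = &test_engine;
}
```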
Joshua Colp [Mon, 14 Mar 2016 13:59:10 +0000 (10:59 -0300)]
build: Add configure check for proto field of PJSIP TLS transport setting.
Older versions of PJSIP do not have the proto field on the TLS transport
setting structure. This change adds a configure check so even if it is
not present we will still be able to build.
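The usual shape of code guarded by such a configure check, as a sketch only. The struct tls_setting and the HAVE_TLS_SETTING_PROTO macro below are hypothetical stand-ins; the real PJSIP structure and the macro name generated by configure are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's TLS setting structure. Newer
 * library versions have the proto field, older ones do not. */
struct tls_setting {
	int method;
#ifdef HAVE_TLS_SETTING_PROTO
	unsigned int proto;
#endif
};

/* Apply the requested protocol mask if this build supports it.
 * Returns 1 if the field was set, 0 if it is not available. */
static int tls_setting_set_proto(struct tls_setting *s, unsigned int proto)
{
	assert(s != NULL);
#ifdef HAVE_TLS_SETTING_PROTO
	s->proto = proto;
	return 1;
#else
	(void) proto; /* the field does not exist in this build */
	return 0;
#endif
}
```

Every access to the optional field sits behind the same macro, so the file compiles either way and only the feature is lost on older versions.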
George Joseph [Sat, 12 Mar 2016 22:02:20 +0000 (15:02 -0700)]
build_system: Split COMPILE_DOUBLE from DONT_OPTIMIZE
I can't ever recall actually needing the intermediate files or the checking
that a double compile produces. What I CAN remember is every DONT_OPTIMIZE
build needing 3 invocations of gcc instead of 1 just to do the checks and
produce those intermediate files.
Having said that, Richard pointed out that the reason for the double compile
was that there were cases in the past where a submitted patch failed to compile
because the submitter never tried it with the optimizations turned on.
To get the best of both worlds, COMPILE_DOUBLE has been split into its own
option. If DONT_OPTIMIZE is turned on, COMPILE_DOUBLE will also be selected
BUT you can then turn it off if all you need are the debugging symbols. This
way you have to make an informed decision about disabling COMPILE_DOUBLE.
To allow COMPILE_DOUBLE to be both auto-selected and turned off, a new feature
was added to menuselect. The <use> element can now contain an "autoselect"
attribute which will turn the used member on but not create a hard dependency.
In cflags.xml, DONT_OPTIMIZE uses COMPILE_DOUBLE with autoselect and
COMPILE_DOUBLE depends on DONT_OPTIMIZE. The resulting behavior:
When DONT_OPTIMIZE is turned on, COMPILE_DOUBLE is turned on because
of the use.
When DONT_OPTIMIZE is turned off, COMPILE_DOUBLE is turned off because
of the depend.
When COMPILE_DOUBLE is turned on, DONT_OPTIMIZE is turned on because
of the depend.
When COMPILE_DOUBLE is turned off, DONT_OPTIMIZE is left as is because
it only uses COMPILE_DOUBLE, it doesn't depend on it.
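The four rules above can be modeled in a few lines. This is an illustration of the described semantics only, not menuselect code; struct options and the two setter functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two menuselect members. */
struct options {
	bool dont_optimize;
	bool compile_double;
};

/* The user toggles DONT_OPTIMIZE. */
static void set_dont_optimize(struct options *o, bool on)
{
	assert(o != NULL);
	o->dont_optimize = on;
	if (on) {
		/* use with autoselect: turning the user on selects the
		 * used member, without making it a hard requirement. */
		o->compile_double = true;
	} else {
		/* depend: COMPILE_DOUBLE cannot stay on without
		 * DONT_OPTIMIZE. */
		o->compile_double = false;
	}
}

/* The user toggles COMPILE_DOUBLE. */
static void set_compile_double(struct options *o, bool on)
{
	assert(o != NULL);
	o->compile_double = on;
	if (on) {
		/* depend: enabling it pulls in DONT_OPTIMIZE. */
		o->dont_optimize = true;
	}
	/* Turning it off leaves DONT_OPTIMIZE as is. */
}
```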
I also made a few tweaks to the ncurses implementation to move things
left a bit to allow longer descriptions.
George Joseph [Thu, 10 Mar 2016 19:09:13 +0000 (12:09 -0700)]
pjproject: Pass (dont_)optimize flags to pjproject and fix pjsua
The pjproject Makefile now uses the Asterisk optimization flags which
are determined by the setting of the DONT_OPTIMIZE menuselect flag.
The Makefile was also restructured so a change to the top level
menuselect.makeopts will result in a rebuild of pjproject.
Also, "--disable-resample" was removed from the pjproject configure
options. Without resample, pjsua (which is used by the testsuite)
can't make audio calls. When it can't, it segfaults.
Walter Doekes [Fri, 11 Mar 2016 22:03:08 +0000 (23:03 +0100)]
app_chanspy: Fix occasional deadlock with ChanSpy and Local channels.
Channel masquerading had a conflict with autochannel locking.
When locking autochannel->channel, the channel is fetched from the
autochannel and then locked. During the fetch, the autochannel -- which
has no locks itself -- can be modified by someone who owns the channel
lock. That means that the value of autochan->channel cannot be trusted
until you hold the lock.
In practice, this caused problems with Local channels getting
masqueraded away while the ChanSpy attempted to get info from that
channel. The old channel which was about to get removed got locked, but
the new (replaced) channel got unlocked (no-op). Because the replaced
channel was now locked (and would never get unlocked), it couldn't get
removed from the channel list in a timely manner, and would now cause
deadlocks when iterating over the channel list.
This change checks the autochannel after locking the channel for changes
to the autochannel. If the channel had been changed, the lock is
reobtained on the new channel.
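The lock-then-recheck loop can be sketched with plain pthread mutexes. The channel and autochan structs and autochan_lock_channel() below are hypothetical simplifications, not the Asterisk types or the code in this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical simplified types. */
struct channel {
	pthread_mutex_t lock;
};

struct autochan {
	struct channel *chan; /* may be swapped by whoever holds chan->lock */
};

/* Lock the channel the autochan currently points at and return it.
 * The caller must unlock the returned channel, so that the channel that
 * was locked is always the one that gets unlocked. */
static struct channel *autochan_lock_channel(struct autochan *ac)
{
	struct channel *chan;

	assert(ac != NULL);
	for (;;) {
		/* This unlocked read is only a guess. A real implementation
		 * would need an atomic or otherwise safe read here. */
		chan = ac->chan;
		pthread_mutex_lock(&chan->lock);
		if (ac->chan == chan) {
			/* Still the same channel: the guess was right and it
			 * cannot change while the lock is held. */
			return chan;
		}
		/* The autochan was switched in between; drop the stale
		 * lock and try again on the new channel. */
		pthread_mutex_unlock(&chan->lock);
	}
}
```

As the message itself notes, this does not protect against the stale channel having been destroyed before the lock attempt; it only guarantees that the lock finally held belongs to the channel the autochan points at.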
In theory it seems possible that after this fix, the lock attempt on the
old (wrong) channel can be on an already destroyed lock, maybe causing
a crash. But that hasn't been observed in the wild and is harder to induce
than the current deadlock.
Thanks go to Filip Frank for suggesting a fix similar to this and
especially to IRC user hexanol for pointing out why this deadlock was
possible and testing this fix. And to Richard for catching my rookie
while loop mistake ;)
George Joseph [Sun, 6 Mar 2016 20:38:41 +0000 (13:38 -0700)]
res_pjsip: Strip spaces from items parsed from comma-separated lists
Configurations like "aors = a, b, c" were either ignoring everything after "a"
or trying to look up " b". Same for mailboxes, ciphers, contacts and a few
others.
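A minimal sketch of splitting such a list and stripping each item, in standard C. The helpers strip() and split_list() are hypothetical and are not the functions res_pjsip uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Trim leading and trailing whitespace in place.
 * Returns a pointer to the first non-space character. */
static char *strip(char *s)
{
	char *end;

	assert(s != NULL);
	while (isspace((unsigned char) *s)) {
		s++;
	}
	end = s + strlen(s);
	while (end > s && isspace((unsigned char) end[-1])) {
		end--;
	}
	*end = '\0';
	return s;
}

/* Split buf in place on commas and store up to max stripped items.
 * Returns the number of items stored. */
static size_t split_list(char *buf, char **items, size_t max)
{
	size_t n = 0;
	char *cur = buf;

	assert(buf != NULL && items != NULL);
	while (cur != NULL && n < max) {
		char *comma = strchr(cur, ',');

		if (comma) {
			*comma = '\0'; /* terminate the current item */
		}
		items[n++] = strip(cur);
		cur = comma ? comma + 1 : NULL;
	}
	return n;
}
```

With this, "a, b ,  c" yields the three items "a", "b" and "c" rather than " b" with a leading space.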