Joshua Colp [Tue, 31 Jan 2017 17:17:50 +0000 (17:17 +0000)]
res_pjsip: Handle invocation of callback on outgoing request when error occurs.
There are some error cases in PJSIP when sending a request that will
result in the callback for the request being invoked. The code did not
handle this case and assumed on every error case that the callback was not
invoked.
The code has been changed to check whether the callback has been invoked
and if so to absorb the error and treat it as a success.
George Joseph [Wed, 25 Jan 2017 12:50:43 +0000 (05:50 -0700)]
debug_utilities: Add ast_logescalator
The escalator works by creating a set of startup commands in cli.conf
that set up logger channels and issue the debug commands for the
subsystems specified. If asterisk is running when it is executed,
the same commands will be issued to the running instance. The original
cli.conf is saved before any changes are made and can be restored by
executing '$prog --reset'.
The log output will be stored in...
$astlogdir/message.$uniqueid
$astlogdir/debug.$uniqueid
$astlogdir/dtmf.$uniqueid
$astlogdir/fax.$uniqueid
$astlogdir/security.$uniqueid
$astlogdir/pjsip_history.$uniqueid
$astlogdir/sip_history.$uniqueid
Some minor tweaks were made to chan_sip, and res_pjsip_history
so their history output could be send to a log channel as packets
are captured.
A minor tweak was also made to manager so events are output to verbose
when "manager set debug on" is issued.
Martin Tomec [Fri, 9 Dec 2016 18:23:37 +0000 (19:23 +0100)]
app_queue: Ensure member is removed from pending when hanging up.
In some cases member is added to pending_members, and the channel
is hung up before any extension state change. So the member would
stay in pending_members forever. So when we call do_hang, we
should also remove member from pending.
Richard Mudgett [Sat, 21 Jan 2017 03:13:34 +0000 (21:13 -0600)]
PJPROJECT logging: Fix detection of max supported log level.
The mechanism used for detecting the maximum log level compiled into the
linked pjproject did not work. The API call simply stores the requested
level into an integer and does no range checking. Asterisk was assuming
that there was range checking and limited the new value to the allowable
range. To get the actual maximum log level compiled into the linked
pjproject we need to get and save off the initial set log level from
pjproject. This is the maximum log level supported.
* Get and save off the initial log level setting before altering it to the
desired level on startup. This has to be done by a macro rather than
calling a core function to avoid incorrectly linking pjproject.
* Split the initial log level warning messages to warn if the linked
pjproject cannot support the requested startup level and if it is too low
to get the pjproject buildopts for "pjproject show buildopts".
* Adjust the CLI "pjproject set log level" to check the saved max log
level and to generate normal output messages instead of a warning message.
George Joseph [Thu, 19 Jan 2017 15:05:36 +0000 (08:05 -0700)]
ari: Implement 'debug all' and request/response logging
The 'ari set debug' command has been enhanced to accept 'all' as an
application name. This allows dumping of all apps even if an app
hasn't registered yet. To accomplish this, a new global_debug global
variable was added to res/stasis/app.c and new APIs were added to
set and query the value.
'ari set debug' now displays requests and responses as well as events.
This required refactoring the existing debug code.
* The implementation for 'ari set debug' was moved from stasis/cli.{c,h}
to ari/cli.{c,h}, and stasis/cli.{c,h} were deleted.
* In order to print the body of incoming requests even if a request
failed, the consumption of the body was moved from the ari stubs
to ast_ari_callback in res_ari.c and the moustache templates were
then regenerated. The body is now passed to ast_ari_invoke and then
on to the handlers. This results in code savings since that template
was inserted multiple times into all the stubs.
An additional change was made to the ao2_str_container implementation
to add partial key searching and a sort function. The existing cli
code assumed it was already there when it wasn't so the tab completion
was never working.
George Joseph [Mon, 23 Jan 2017 15:10:50 +0000 (08:10 -0700)]
pjproject_bundled: Fix setting max log level
An earlier attempt to prevent pjsua from spitting out an extra 6795
lines of debug output every time the testsuite called it was also
turning off the ability for asterisk to output debug info when it
needed to. This patch reverts the earlier fix and instead adds
a pjproject patch that sets the startup log level to 1 for pjsua
pjsystest and the pjsua python binding. This is an asterisk-only
patch that does not affect pjproject functionality and will not be
submitted upstream.
George Joseph [Fri, 13 Jan 2017 17:03:15 +0000 (10:03 -0700)]
debug_utilities: Create ast_loggrabber
ast_loggrabber gathers log files from customizable search patterns,
optionally converts POSIX timestamps to a readable format and
tarballs the results.
George Joseph [Tue, 3 Jan 2017 21:14:09 +0000 (14:14 -0700)]
pjproject_bundled: Compile pjsua with max log level = 2
A while back, we changed config_site.h to set PJ_LOG_MAX_LEVEL = 6.
This allowed us to control the log level better from inside Asterisk.
An unfortunate side effect of this was that the pjsua binary and
python bindings were also compiled with log level set to 6 so whenever
a testsuite test that uses pjsua runs, it spits out 6795 lines of
debug in an instant even before the test starts. I believe this
overruns the Jenkins capture buffer and prevents the test from
properly terminating. In turn, this results in the testsuite just
hanging until the job is killed. It's more frequent on the higher
end agents because they can spit out the messages faster.
Unfortunately, the messages are all spit out before we have control
of the python pj.Lib instance where we can set logging levels so the
only alternative was to actually compile pjsua and _pjsua.so with an
overridden PJ_LOG_MAX_LEVEL. Although defining a lower max level was
done in the Makefile, the define in config_site.h had to be wrapped
with "#ifndef" so the change would take effect.
George Joseph [Sun, 18 Dec 2016 21:23:17 +0000 (14:23 -0700)]
pjproject_bundled: Make build single threaded
There were just too many issues in various environments with
multi threaded building of pjproject. It doesn't really speed
things up anyway since asterisk is already being compiled in
parallel.
Richard Mudgett [Thu, 24 Nov 2016 00:27:54 +0000 (18:27 -0600)]
PJPROJECT logging: Made easier to get available logging levels.
Use of the new logging is as simple as issuing the new CLI command or
setting the new pjproject.conf option.
Other options that can affect the logging are how you have the pjproject
log levels mapped to Asterisk log types in pjproject.conf and if you have
configured Asterisk to log the DEBUG type messages. Altering the
pjproject.conf level mapping shouldn't be necessary for most installations
as the default mapping is sensible. Configuring Asterisk to log the DEBUG
message type is standard practice for collecting debug information.
* Added CLI "pjproject set log level" command to dynamically adjust the
maximum pjproject log message level.
* Added CLI "pjproject show log level" command to see the currently set
maximum pjproject log message level.
* Added pjproject.conf startup section "log_level" option to set the
initial maximum pjproject log message level so all messages could be
captured from initialization.
* Set PJ_LOG_MAX_LEVEL to 6 to compile in all defined logging levels into
bundled pjproject. Pjproject will use the currently set run time log
level to determine if a log message is generated just like Asterisk
verbose and debug logging levels.
* In log_forwarder(), made always log enabled and mapped pjproject log
messages. DEBUG mapped log messages are no longer gated by the current
Asterisk debug logging level.
* Removed RAII_VAR() from res_pjproject.c:get_log_level().
George Joseph [Wed, 11 Jan 2017 00:10:39 +0000 (17:10 -0700)]
debug_utilities: Create the ast_coredumper utility
This utility allows easy manipulation of asterisk coredumps.
* Configurable search paths and patterns for existing coredumps
* Can generate a consistent coredump from the running instance
* Can dump the lock_infos table from a coredump
* Dumps backtraces to separate files...
- thread apply 1 bt full -> <coredump>.thread1.txt
- thread apply all bt -> <coredump>.brief.txt
- thread apply all bt full -> <coredump>.full.txt
- lock_infos table -> <coredump>.locks.txt
* Can tarball corefiles and optionally delete them after processing
* Can tarball results files and optionally delete them after processing
* Converts ':' in coredump and results file names '-' to facilitate
uploading. Jira for instance, won't accept file names with colons
in them.
Tested on Fedora24+, Ubuntu14+, Debian6+, CentOS6+ and FreeBSD9+[1].
[1] For *BSDs, the "devel/gdb" package might have to be installed to
get a recent gdb. The utility will check all instances of gdb
it finds in $PATH and if one isn't found that can run python, it
prints a friendly error.
Joshua Colp [Thu, 22 Dec 2016 22:00:58 +0000 (22:00 +0000)]
chan_pjsip: Use session for retrieving CHANNEL() information.
The CHANNEL() dialplan function implementation for PJSIP allows
querying of PJSIP specific information. This used the channel
passed in to get the PJSIP session and associated information.
It is possible for this channel to be masqueraded and end
up as a different channel type by the time the information
request is actually acted upon.
This change retrieves the PJSIP session safely and accesses
data from it (including channel). This provides a guarantee
that the session and channel will not be altered when the
request is being acted upon.
Richard Mudgett [Fri, 23 Dec 2016 18:10:40 +0000 (12:10 -0600)]
bridge_native_rtp.c: Fix native rtp bridge data race.
native_rtp_bridge_compatible() didn't lock the bridge channels before
checking the channels for native bridging ability. As a result, one of
the channel's native format capabilities structure got replaced out from
under the native bridge check. Use of a stale pointer to freed memory
causes bad things to happen.
MALLOC_DEBUG, DO_CRASH, and the
tests/channels/pjsip/transfers/blind_transfer/caller_direct_media
testsuite test caught this.
* Add missing channel locking in native_rtp_bridge_compatible().
ast_rtp_remote_address_set() could pass an uninitialized 'us' parameter to
ast_ouraddrfor(). If ast_ouraddrfor() returns an error then the 'us'
parameter may not get initialized. Thus when the code tries to save the
'us' parameter to the local address we could try to copy a ridiculous
sized memory buffer and segfault.
* Made pass an initialized 'us' parameter to ast_ouraddrfor().
Richard Mudgett [Wed, 21 Dec 2016 23:54:42 +0000 (17:54 -0600)]
chan_rtp.c: Fix uninitialized memory crash.
unicast_rtp_request() could pass an uninitialized 'us' parameter to
ast_ouraddrfor(). If ast_ouraddrfor() returns an error then the 'us'
parameter may not get initialized. Thus when the code tries to save the
'us' parameter to the local address we could try to copy a ridiculous
sized memory buffer and segfault.
* Made pass an initialized 'us' parameter to ast_ouraddrfor() and abort
the UnicastRTP channel request if it fails.
George Joseph [Tue, 13 Dec 2016 20:06:34 +0000 (13:06 -0700)]
res_sorcery_memory_cache: Change an error to a debug message
When a sorcery user calls ast_sorcery_delete on an object that
may have already expired from the cache, res_sorcery_memory_cache
spits out an ERROR. Since this can happen frequently and validly when
an inbound registration expires after the cache entry expired, the
errors are unnecessary and misleading. Changed to a debug/1.
HCOLON = *( LOWCTL / SP ) ":" SWS
LOWCTL = %x00-1F ; CTL without DEL
This discrepancy meant that SIP proxies in front of Asterisk with
chan_sip could pass on unknown headers with \x00-\x1F in them, which
would be treated by Asterisk as a different (known) header. For
example, the "To\x01:" header would gladly be forwarded by some proxies
as irrelevant, but chan_sip would treat it as the relevant "To:" header.
Those relying on a SIP proxy to scrub certain headers could mistakenly
get unexpected and unvalidated data fed to Asterisk.
This change fixes so chan_sip only considers SP/HTAB as valid tokens
before the colon, making it agree on the headers with other speakers of
SIP.
Joshua Colp [Tue, 15 Nov 2016 00:18:21 +0000 (00:18 +0000)]
res_format_attr_opus: Fix crash when fmtp contains spaces.
When an opus offer or answer was received that contained an
fmtp line with spaces between the attributes the module would
fail to properly parse it and crash due to recursion.
This change makes the module handle the space properly and
also removes the recursion requirement.
George Joseph [Tue, 6 Dec 2016 20:54:25 +0000 (13:54 -0700)]
res_pjsip_registrar: AMI Add RegistrationInboundContactStatuses command
The PJSIPShowRegistrationsInbound AMI command was just dumping out
all AORs which was pretty useless and resource heavy since it had
to get all endpoints, then all aors for each endpoint, then all
contacts for each aor.
PJSIPShowRegistrationInboundContactStatuses sends ContactStatusDetail
events which meets the intended purpose of the other command and has
significantly less overhead. Also, some additional fields that were
added to Contact since the original creation of the ContactStatusDetail
event have been added to the end of the event.
For compatibility purposes, PJSIPShowRegistrationsInbound is left
intact.
Richard Mudgett [Tue, 6 Dec 2016 22:45:38 +0000 (16:45 -0600)]
Bundled pjproject: Fix finding SIP transactions.
Occasionally SIP message transactions are not found when they should be.
In the particular case an incoming INVITE transaction is CANCELed but the
INVITE transaction cannot be found so a 481 response is returned for the
CANCEL. The problematic calls have a '_' character in the Via branch
parameter.
The problem is in the pjproject PJ_HASH_USE_OWN_TOLOWER feature's code.
The problem with the "own tolower" code is that it does not calculate the
same hash value as when the pj_tolower() function is used. The "own
tolower" code will erroneously modify the ASCII characters '@', '[', '\\',
']', '^', and '_'. Calls to pj_hash_calc_tolower() can use the
PJ_HASH_USE_OWN_TOLOWER substitute algorithm when enabled. Calls to
pj_hash_get_lower(), pj_hash_set_lower(), and pj_hash_set_np_lower() call
find_entry() which never uses the PJ_HASH_USE_OWN_TOLOWER algorithm. As a
result you may not be able to find a hash tabled entry because the
calculated hash values would differ.
George Joseph [Tue, 6 Dec 2016 18:06:45 +0000 (11:06 -0700)]
pjproject_bundled: Fix missing inclusion of symbols
Added back in a -g3, and an -O3 when DONT_OPTIMIZE is not set, to
the CFLAGS. Not sure how they went missing.
Also fixed an uninstall problem where we weren't removing the
symlink from libasteriskpj.so.2 to libasteriskpj.so. While I was
there, I fixed it for libasteriskssl as well.
The recent change that made frame deferral into an API had a behavior
change to it. When frame deferral was completed, we would take all of
the deferred frames and queue them all onto the channel in one call to
ast_queue_frame_head(). Before frame deferral was API-ized, places that
performed manual frame deferral would actually take each deferred frame
and queue them onto the channel.
This change in behavior caused the confbridge_recording test to start
failing consistently. Without going too crazily deep into the details,
a channel was getting "stuck" in an ast_safe_sleep(). An AMI redirect
was attempting to break it out of the sleep, but because there were more
frames in the channel read queue than expected, the channel ended up
being unable to break from its sleep loop.
By restoring the behavior of individual frame queuing after deferral,
the test starts passing again.
Note, this points to a potential underlying issue pointing to an
"unbalance" that can occur when queuing multiple frames at once,
and so a follow-up issue is being created to investigate that
possibility.
George Joseph [Mon, 21 Nov 2016 15:40:59 +0000 (08:40 -0700)]
build: Backport addition of librt check to configure.ac
A while back, a master-only change was made to check for librt which
should probably have been cherry-picked to 13 at that time. Sometime
between then and now, part of that change did make it into 13 but it
was incomplete and non-functional. This patch backports the rest
of the librt check and allows the link of libasteriskpj to use the
results.
George Joseph [Thu, 17 Nov 2016 02:24:08 +0000 (19:24 -0700)]
build: Various OpenBSD issues
OpenBSD's 'find' doesn't take the -delete argument so you have to pipe
through 'xargs rm -rf'.
'echo -e' doesn't like \t starting a line. It just prints 't' which
causes the libasteriskpj.exports file to be garbage. They were just
cosmetic so they were removed.
librt doesn't exist so the link of libasteriskpj.so fails. It's not
actually needed for linux anyway so -lrt was removed from the link.
res_rtp_asterisk was failing to load because of an undefined
DTLS_method. '|| defined(LIBRESSL_VERSION_NUMBER)' was added to the #if
so DTLSv1_method is used instead.
Mark Michelson [Wed, 16 Nov 2016 21:42:39 +0000 (15:42 -0600)]
res_format_attr_opus: Fix fmtp generation.
res_format_attr_opus assumed that the string being passed into it was
empty. It tried to determine if the only thing it had written was
a=fmtp:<num>
And if it had, it would reset the string. Its calculation was off when
working with chan_sip, though. chan_sip passes the entire built SDP
rather than an empty string. This resulted in always putting an empty
fmtp line in the SDP.
Richard Mudgett [Tue, 15 Nov 2016 22:23:35 +0000 (16:23 -0600)]
codec_opus: Fix warning when Opus negotiated but codec_opus not loaded.
When Opus is negotiated but not loaded, the log is spammed with messages
because the system does not know how to calculate the number of samples in
a frame.
* Suppress the warning by supplying a function that assumes 20ms of
samples in the frame. For pass through support it doesn't really seem to
matter what number of samples is returned anyway.
Richard Mudgett [Mon, 14 Nov 2016 20:36:52 +0000 (14:36 -0600)]
res_pjsip_outbound_authenticator_digest.c: Fix memory pool leak.
Responding to authentication challenges leaks PJSIP memory pools.
The leak was introduced with a pjproject 2.5.5 API change.
https://trac.pjsip.org/repos/ticket/1929 changed the API usage of
pjsip_auth_clt_init() to require the new API pjsip_auth_clt_deinit() to
clean up cached authentication allocations that get allocated with
pjsip_auth_clt_reinit_req().
George Joseph [Tue, 15 Nov 2016 18:01:04 +0000 (11:01 -0700)]
file.c/__ast_file_read_dirs: Fix issues on filesystems without d_type
One of the code paths in __ast_file_read_dirs will only get executed if
the OS doesn't support dirent->d_type OR if the filesystem the
particular file is on doesn't support it. So, while standard Linux
systems support the field, some filesystems like XFS do not. In this
case, we need to call stat() to determine whether the directory entry
is a file or directory so we append the filename to the supplied
directory path and call stat. We forgot to truncate path back to just
the directory afterwards though so we were passing a complete file name
to the callback in the dir_name parameter instead of just the directory
name.
The logic has been re-written to only create a full_path if we need to
call stat() or if we need to descend into another directory.
Matt Jordan [Mon, 14 Nov 2016 21:57:08 +0000 (15:57 -0600)]
pjproject: Use a much higher limit for PJ_ICE_MAX_CHECKS
The PJ_ICE_MAX_CHECKS constant is used by pjproject to determine how
many pairs of local/remote candidates will be made. If for some reason
we reach this upper bound, ICE will generally fail and no media will
flow between the browser and Asterisk.
This patch makes PJ_ICE_MAX_CHECKS set to the total possible number of
pairs of candidates we'd theoretically allow, which is
PJ_ICE_MAX_CAND^2. Prior to this patch, we simply multiplied
PJ_ICE_MAX_CAND by two; on systems with multiple interfaces (I blame
Docker), this is far too low to allow WebRTC calls to succeed.
Setting this to be PJ_ICE_MAX_CAND^2 allowed WebRTC calls to succeed
even when the system Asterisk was running on had quite a few virtual
interfaces.
Matt Jordan [Mon, 14 Nov 2016 21:32:14 +0000 (15:32 -0600)]
apps/app_echo: Only relay a single video source change frame
In 9785e8d0, app_echo was updated to relay video source updates to the
channel for the purposes of displaying video in WebRTC tests.
Unfortunately, this can cause a Kafkaesque nightmare if two or more
Local channels are in a bridge together where their ends are in
app_echo. When this situation occurs, a video update sent into app_echo
will cause the video update to be relayed to the other Local channels,
causing another round of video updates, etc. In not much time at all,
the channel length queues will be overwhelmed, channel alert pipes will
fail, and all hell will break loose as Asterisk merrily continues to
throw more video update requests onto the channels.
This patch updates app_echo to *only* relay a single video update. Once
a video update has been made, all further video updates are dropped.
This meets the intended purpose of the original patch: if we get a video
update and we're in app_echo, go ahead and ask the sender to update
themselves. However, once we've got that video stream sync'd up, don't
keep spamming the world.
Matt Jordan [Tue, 8 Nov 2016 16:11:41 +0000 (10:11 -0600)]
res/ari/resource_bridges: Add the ability to manipulate the video source
In multi-party bridges, Asterisk currently supports two video modes:
* Follow the talker, in which the speaker with the most energy is shown
to all participants but the speaker, and the speaker sees the
previous video source
* Explicitly set video sources, in which all participants see a locked
video source
Prior to this patch, ARI had no ability to manipulate the video source.
This isn't important for two-party bridges, in which Asterisk merely
relays the video between the participants. However, in a multi-party
bridge, it can be advantageous to allow an external application to
manipulate the video source.
This patch provides two new routes to accomplish this:
(1) setVideoSource: POST /bridges/{bridgeId}/videoSource/{channelId}
Sets a video source to an explicit channel
(2) clearVideoSource: DELETE /bridges/{bridgeId}/videoSource
Removes any explicit video source, and sets the video mode to talk
detection
George Joseph [Mon, 14 Nov 2016 18:16:03 +0000 (11:16 -0700)]
cli: Fix ast_el_read_char to work with libedit >= 3.1
Libedit 3.1 is not build with unicode on as a default and so the
prototype for the el_gets callback changed from expecting a char buffer
to accepting a wchar buffer. If ast_el_read_char isn't changed,
the cli reads garbage from teh terminal.
Added a configure test for (*el_rfunc_t)(EditLine *, wchar_t *) and
updated ast_el_read_char to use the HAVE_ define to detemrine whether
to use char or wchar.
* Don't hold the req_wrapper lock too long in endpt_send_request(). We
could block the PJSIP monitor thread if the timeout timer expires.
sip_get_tpselector_from_endpoint() does a sorcery access that could take
awhile accessing a database. pjsip_endpt_send_request() might take awhile
if selecting a transport.
* Shorten the time that the req_wrapper lock is held in the callback
functions.