Kevin Harwell [Mon, 1 Feb 2021 21:24:25 +0000 (15:24 -0600)]
AST-2021-002: Remote crash possible when negotiating T.38
When an endpoint requests to re-negotiate for fax and the incoming
re-invite is received prior to Asterisk sending out the 200 OK for
the initial invite the re-invite gets delayed. When Asterisk does
finally send the re-inivite the SDP includes streams for both audio
and T.38.
This happens because when the pending topology and active topologies
differ (pending stream is not in the active) in the delayed scenario
the pending stream is appended to the active topology. However, in
the fax case the pending stream should replace the active.
This patch makes it so when a delay occurs during fax negotiation,
to or from, the audio stream is replaced by the T.38 stream, or vice
versa instead of being appended.
Further when Asterisk sent the re-invite with both audio and T.38,
and the endpoint responded with a declined T.38 stream then Asterisk
would crash when attempting to change the T.38 state.
This patch also puts in a check that ensures the media state has a
valid fax session (associated udptl object) before changing the
T.38 state internally.
Joshua C. Colp [Fri, 5 Feb 2021 11:26:02 +0000 (07:26 -0400)]
pjsip: Make modify_local_offer2 tolerate previous failed SDP.
If a remote side is broken and sends an SDP that can not be
negotiated the call will be torn down but there is a window
where a second 183 Session Progress or 200 OK that is forked
can be received that also attempts to negotiate SDP. Since
the code marked the SDP negotiation as being done and complete
prior to this it assumes that there is an active local and remote
SDP which it can modify, while in fact there is not as the SDP
did not successfully negotiate. Since there is no local or remote
SDP a crash occurs.
This patch changes the pjmedia_sdp_neg_modify_local_offer2
function to no longer assume that a previous SDP negotiation
was successful.
Kevin Harwell [Mon, 19 Oct 2020 22:21:57 +0000 (17:21 -0500)]
AST-2020-001 - res_pjsip: Return dialog locked and referenced
pjproject returns the dialog locked and with a reference. However,
in Asterisk the method that handles this decrements the reference
and removes the lock prior to returning. This makes it possible,
under some circumstances, for another thread to free said dialog
before the thread that created it attempts to use it again. Of
course when the thread that created it tries to use a freed dialog
a crash can occur.
This patch makes it so Asterisk now returns the newly created
dialog both locked, and with an added reference. This allows the
caller to de-reference, and unlock the dialog when it is safe to
do so.
In the case of a new SIP Invite the lock, and reference are now
held for the entirety of the new invite handling process.
Otherwise it's possible for the dialog, or its dependent objects,
like the transaction, to disappear. For example if there is a TCP
transport error.
Ben Ford [Mon, 2 Nov 2020 16:29:31 +0000 (10:29 -0600)]
AST-2020-002 - res_pjsip: Stop sending INVITEs after challenge limit.
If Asterisk sends out an INVITE and receives a challenge with a
different nonce value each time, it will continuously send out INVITEs,
even if the call is hung up. The endpoint must be configured for
outbound authentication for this to occur. A limit has been set on
outbound INVITEs so that, once reached, Asterisk will stop sending
INVITEs and the transaction will terminate.
The ast_sip_dialog_get_session function returns the session
with reference count increased. This was not taken into
account and was causing sessions to remain around when they
should not be.
When a channel joins a bridge, we do topology change requests on all
existing channels to add the new participant to them. However the
announcer channel will return an error because it doesn't support
topology in the first place. Unfortunately, there doesn't seem to be a
reliable way to tell if the error is expected or not so the error is
ignored for all channels. If the request fails on a "real" channel,
that channel just won't get the new participant's video.
George Joseph [Fri, 11 Sep 2020 16:09:55 +0000 (10:09 -0600)]
res_pjsip_session: Fix issue with COLP and 491
The recent 491 changes introduced a check to determine if the active
and pending topologies were equal and to suppress the re-invite if they
were. When a re-invite is sent for a COLP-only change, the pending
topology is NULL so that check doesn't happen and the re-invite is
correctly sent. Of course, sending the re-invite sets the pending
topology. If a 491 is received, when we resend the re-invite, the
pending topology is set and since we didn't request a change to the
topology in the first place, pending and active topologies are equal so
the topologies-equal check causes the re-invite to be erroneously
suppressed.
This change checks if the topologies are equal before we run the media
state resolver (which recreates the pending topology) so that when we
do the final topologies-equal check we know if this was a topology
change request. If it wasn't a change request, we don't suppress
the re-invite even though the topologies are equal.
When both Asterisk and a UA send re-invites at the same time, both
send 491 "Transaction in progress" responses to each other and back
off a specified amount of time before retrying. When Asterisk
prepares to send its re-invite, it sets up the session's pending
media state with the new topology it wants, then sends the
re-invite. Unfortunately, when it received the re-invite from the
UA, it partially processed the media in the re-invite and reset
the pending media state before sending the 491 losing the state it
set in its own re-invite.
Asterisk also was not tracking re-invites received while an existing
re-invite was queued resulting in sending stale SDP with missing
or duplicated streams, or no re-invite at all because we erroneously
determined that a re-invite wasn't needed.
There was also an issue in bridge_softmix where we were using a stream
from the wrong topology to determine if a stream was added. This also
caused us to erroneously determine that a re-invite wasn't needed.
Regardless of how the delayed re-invite was triggered, we need to
reconcile the topology that was active at the time the delayed
request was queued, the pending topology of the queued request,
and the topology currently active on the session. To do this we
need a topology resolver AND we need to make stream named unique
so we can accurately tell what a stream has been added or removed
and if we can re-use a slot in the topology.
Summary of changes:
* bridge_softmix:
* We no longer reset the stream name to "removed" in
remove_all_original_streams(). That was causing multiple streams
to have the same name and wrecked the checks for duplicate streams.
* softmix_bridge_stream_sources_update() was checking the old_stream
to see if it had the softmix prefix and not considering the stream
as "new" if it did. If the stream in that slot has something in it
because another re-invite happened, then that slot in old might
have a softmix stream but the same stream in new might actually
be a new one. Now we check the new_stream's name instead of
the old_stream's.
* stream:
* Instead of using plain media type name ("audio", "video", etc) as
the default stream name, we now append the stream position to it
to make it unique. We need to do this so we can distinguish multiple
streams of the same type from each other.
* When we set a stream's state to REMOVED, we no longer reset its
name to "removed" or destroy its metadata. Again, we need to
do this so we can distinguish multiple streams of the same
type from each other.
* res_pjsip_session:
* Added resolve_refresh_media_states() that takes in 3 media states
and creates an up-to-date pending media state that includes the changes
that might have happened while a delayed session refresh was in the
delayed queue.
* Added is_media_state_valid() that checks the consistency of
a media state and returns a true/false value. A valid state has:
* The same number of stream entries as media session entries.
Some media session entries can be NULL however.
* No duplicate streams.
* A valid stream for each non-NULL media session.
* A stream that matches each media session's stream_num
and media type.
* Updated handle_incoming_sdp() to set the stream name to include the
stream position number in the name to make it unique.
* Updated the ast_sip_session_delayed_request structure to include both
the pending and active media states and updated the associated delay
functions to process them.
* Updated sip_session_refresh() to accept both the pending and active
media states that were in effect when the request was originally queued
and to pass them on should the request need to be delayed again.
* Updated sip_session_refresh() to call resolve_refresh_media_states()
and substitute its results for the pending state passed in.
* Updated sip_session_refresh() with additional debugging.
* Updated session_reinvite_on_rx_request() to simply return PJ_FALSE
to pjproject if a transaction is in progress. This stops us from
creating a partial pending media state that would be invalid later on.
* Updated reschedule_reinvite() to clone both the current pending and
active media states and pass them to delay_request() so the resolver
can tell what the original intention of the re-invite was.
Ben Ford [Mon, 31 Aug 2020 16:14:20 +0000 (11:14 -0500)]
Bridging: Use a ref to bridge_channel's channel to prevent crash.
There's a race condition with bridging where a bridge can be torn down
causing the bridge_channel's ast_channel to become NULL when it's still
needed. This particular case happened with attended transfers, but the
crash occurred when trying to publish a stasis message. Now, the
bridge_channel is locked, a ref to the ast_channel is obtained, and that
ref is passed down the chain.
George Joseph [Wed, 19 Aug 2020 12:37:23 +0000 (06:37 -0600)]
scope_trace: Added debug messages and added additional macros
The SCOPE_ENTER and SCOPE_EXIT* macros now print debug messages
at the same level as the scope level. This allows the same
messages to be printed to the debug log when AST_DEVMODE
isn't enabled.
Also added a few variants of the SCOPE_EXIT macros that will
also call ast_log instead of ast_debug to make it easier to
use scope tracing and still print error messages.
George Joseph [Fri, 14 Aug 2020 16:13:33 +0000 (10:13 -0600)]
logger.c: Added a new log formatter called "plain"
Added a new log formatter called "plain" that always prints
file, function and line number if available (even for verbose
messages) and never prints color control characters. It also
doesn't apply any special formatting for verbose messages.
Most suitable for file output but can be used for other channels
as well.
You use it in logger.conf like so:
debug => [plain]debug
console => [plain]error,warning,debug,notice,pjsip_history
messages => [plain]warning,error,verbose
George Joseph [Fri, 28 Aug 2020 14:34:09 +0000 (08:34 -0600)]
ast_coredumper: Fix issues with naming
If you run ast_coredumper --tarball-coredumps in the same directory
as the actual coredump, tar can fail because the link to the
actual coredump becomes recursive. The resulting tarball will
have everything _except_ the coredump (which is usually what
you need)
There's also an issue that the directory name in the tarball
is the same as the coredump so if you extract the tarball the
directory it creates will overwrite the coredump.
So:
* Made the link to the coredump use the absolute path to the
file instead of a relative one. This prevents the recursive
link and allows tar to add the coredump.
* The tarballed directory is now named <coredump>.output instead
of just <coredump> so if you expand the tarball it won't
overwrite the coredump.
George Joseph [Thu, 23 Jul 2020 19:47:25 +0000 (13:47 -0600)]
res_pjsip_session: Ensure reused streams have correct bundle group
When a bundled stream is removed, its bundle_group is reset to -1.
If that stream is later reused, the bundle parameters on session
media need to be reset correctly it could mistakenly be rebundled
with a stream that was removed and never reused. Since the removed
stream has no rtp instance, a crash will result.
pjsip: Add timer refactor patch and timer 0 cancellation.
There have been numerous timer issues over the years which
resulted in Teluu doing a major refactor of the timer
implementation whereby the timer entries themselves are not
trusted. This change backports this refactor which has
been shown to resolve timer crashes.
Additionally another patch has been backported to prevent
timer entries with an internal id of 0 from being canceled.
This would result in the invalid timer id of 0 being placed
into the timer heap causing issues. This is also a backport.
George Joseph [Sun, 5 Jul 2020 23:51:04 +0000 (17:51 -0600)]
Scope Trace: Make it easier to trace through synchronous tasks
Tracing through synchronous tasks was a little troublesome because
the new thread's stack counter reset to 0. This change allows
a synchronous task to set its trace level to be the same as the
thread that pushed the task. For now, the task's level has to be
passed in the task's data structure but a future enhancement to the
taskprocessor subsystem could automatically set the trace level
of the servant to be that of the caller.
This doesn't really make sense for async tasks because you never
know when they're going to run anyway.
George Joseph [Tue, 30 Jun 2020 13:56:34 +0000 (07:56 -0600)]
Scope Trace: Add some new tracing macros and an ast_str helper
Created new SCOPE_ functions that don't depend on RAII_VAR. Besides
generating less code, the use of the explicit SCOPE_EXIT macros
capture the line number where the scope exited. The RAII_VAR
versions can't do that.
* SCOPE_ENTER(level, ...): Like SCOPE_TRACE but doesn't use
RAII_VAR and therefore needs needs one of...
* SCOPE_EXIT(...): Decrements the trace stack counter and optionally
prints a message.
* SCOPE_EXIT_EXPR(__expr, ...): Decrements the trace stack counter,
optionally prints a message, then executes the expression.
SCOPE_EXIT_EXPR(break, "My while got broken\n");
* SCOPE_EXIT_RTN(, ...): Decrements the trace stack counter,
optionally prints a message, then returns without a value.
SCOPE_EXIT_RTN("Bye\n");
* SCOPE_EXIT_RTN_VALUE(__return_value, ...): Decrements the trace
stack counter, optionally prints a message, then returns the value
specified.
SCOPE_EXIT_RTN_VALUE(rc, "Returning with RC: %d\n", rc);
Create an ast_str helper ast_str_tmp() that allocates a temporary
ast_str that can be passed to a function that needs it, then frees
it. This makes using the above macros easier. Example:
SCOPE_ENTER(1, Format Caps 1: %s Format Caps 2: %s\n",
ast_str_tmp(32, ast_format_cap_get_names(cap1, &STR_TMP),
ast_str_tmp(32, ast_format_cap_get_names(cap2, &STR_TMP));
The calls to ast_str_tmp create an ast_str of the specified initial
length which can be referenced as STR_TMP. It then calls the
expression, which must return a char *, ast_strdupa's it, frees
STR_TMP, then returns the ast_strdupa'd string. That string is
freed when the function returns.
George Joseph [Thu, 14 May 2020 18:24:19 +0000 (12:24 -0600)]
Scope Tracing: A new facility for tracing scope enter/exit
What's wrong with ast_debug?
ast_debug is fine for general purpose debug output but it's not
really geared for scope tracing since it doesn't present its
output in a way that makes capturing and analyzing flow through
Asterisk easy.
How is scope tracing better?
Scope tracing uses the same "cleanup" attribute that RAII_VAR
uses to print messages to a separate "trace" log level. Even
better, the messages are indented and unindented based on a
thread-local call depth counter. When output to a separate log
file, the output is uncluttered and easy to follow.
Here's an example of the output. The leading timestamps and
thread ids are removed and the output cut off at 68 columns for
commit message restrictions but you get the idea.
Scope isn't limited to functions any more than RAII_VAR is. You
can also see entry and exit from "if", "for", "while", etc blocks.
There is also an ast_trace() macro that doesn't track entry or
exit but simply outputs a message to the trace log using the
current indent level. The deepest message in the sample
(chan_pjsip.c:3245) was used to indicate which "case" in a
"select" was executed.
How do you use it?
More documentation is available in logger.h but here's an overview:
* Configure with --enable-dev-mode. Like debug, scope tracing
is #ifdef'd out if devmode isn't enabled.
* Add a SCOPE_TRACE() call to the top of your function.
* Set a logger channel in logger.conf to output the "trace" level.
* Use the CLI (or cli.conf) to set a trace level similar to setting
debug level... CLI> core set trace 2 res_pjsip.so
Summary Of Changes:
* Added LOG_TRACE logger level. Actually it occupies the slot
formerly occupied by the now defunct "event" level.
* Added core asterisk option "trace" similar to debug. Includes
ability to specify global trace level in asterisk.conf and CLI
commands to turn on/off and set levels. Levels can be set
globally (probably not a good idea), or by module/source file.
* Updated sample asterisk.conf and logger.conf. Tracing is
disabled by default in both.
* Added __ast_trace() to logger.c which keeps track of the indent
level using TLS. It's #ifdef'd out if devmode isn't enabled.
* Added ast_trace() and SCOPE_TRACE() macros to logger.h.
These are all #ifdef'd out if devmode isn't enabled.
Why not use gcc's -finstrument-functions capability?
gcc's facility doesn't allow access to local data and doesn't
operate on non-function scopes.
Known Issues:
The only know issue is that we currently don't know the line
number where the scope exited. It's reported as the same place
the scope was entered. There's probably a way to get around it
but it might involve looking at the stack and doing an 'addr2line'
to get the line number. Kind of like ast_backtrace() does.
Not sure if it's worth it.
George Joseph [Mon, 6 Jul 2020 15:57:18 +0000 (09:57 -0600)]
frame.c: Make debugging easier
* ast_frame_subclass2str() and ast_frame_type2str() now return
a pointer to the buffer that was passed in instead of void.
This makes it easier to use these functions inline in
printf-style debugging statements.
* Added many missing control frame entries in
ast_frame_subclass2str.
When dealing with a lot of video streams on WebRTC
the resulting SDPs can grow to be quite large. This
effectively doubles the maximum size to allow more
streams to exist.
The res_http_websocket module has also been changed
to use a buffer on the session for reading in packets
to ensure that the stack space usage is not excessive.
Joshua C. Colp [Fri, 26 Jun 2020 10:18:55 +0000 (07:18 -0300)]
res_pjsip: Apply AOR outbound proxy to static contacts.
The outbound proxy for an AOR was not being applied to
any statically configured Contacts. This resulted in the
OPTIONS requests being sent to the wrong target.
This change sets the outbound proxy on statically configured
contacts once the AOR configuration is done being
applied.
Joshua C. Colp [Wed, 17 Jun 2020 08:58:44 +0000 (05:58 -0300)]
res_pjsip_session: Preserve label on incoming re-INVITE.
When a re-INVITE is received we create a new set of
streams that are then swapped in as the active streams.
We did not preserve the SDP label from the previous
streams, resulting in the label getting lost.
This change ensures that if an SDP label is present
on the previous stream then it is set on the new stream.
Joshua C. Colp [Wed, 3 Jun 2020 16:47:42 +0000 (13:47 -0300)]
core_unreal / core_local: Add multistream and re-negotiation.
When requesting a Local channel the requested stream topology
or a converted stream topology will now be placed onto the
resulting channels.
Frames written in on streams will now also preserve the stream
identifier as they are queued on the opposite channel.
Finally when a stream topology change is requested it is
immediately accepted and reflected on both channels. Each
channel also receives a queued frame to indicate that the
topology has changed.
Joshua C. Colp [Mon, 8 Jun 2020 11:27:53 +0000 (08:27 -0300)]
res_rtp_asterisk: Don't assume setting retrans props means to enable.
The "value" passed in when setting an RTP property determines
whether it should be enabled or disabled. The RTP send and
receive retrans props did not examine this to know if the
buffers should be enabled. They assumed they always should be.
This change makes it so that the "value" passed in is
respected.
Joshua C. Colp [Wed, 10 Jun 2020 17:11:16 +0000 (14:11 -0300)]
bridge_softmix: Add additional old states for adding new source.
There are three states that an old stream can be in to allow
becoming a source stream in a new stream:
1. Removed
2. Inactive
3. Sendonly
This change adds the two missing ones, inactive and sendonly,
so if a stream transitions from those to a state where they are
providing video to Asterisk we properly re-negotiate the other
participants.
George Joseph [Mon, 8 Jun 2020 00:02:00 +0000 (18:02 -0600)]
app_confbridge: Plug ref leak of bridge channel with send_events
When send_events is enabled for a user, we were leaking a reference
to the bridge channel in confbridge_manager.c:send_message(). This
also caused the bridge snapshot to not be destroyed.
Kevin Harwell [Wed, 3 Jun 2020 17:06:44 +0000 (12:06 -0500)]
Compiler fixes for gcc 10
This patch fixes a few compile warnings/errors that now occur when using gcc
10+.
Also, the Makefile.rules check to turn off partial inlining in gcc versions
greater or equal to 8.2.1 had a bug where it only it only checked against
versions with at least 3 numbers (ex: 8.2.1 vs 10). This patch now ensures
any version above the specified version is correctly compared.
George Joseph [Thu, 30 Apr 2020 15:56:03 +0000 (09:56 -0600)]
app_voicemail: Add workaround for a gcc 10 issue with -Wrestrict
The gcc 10 -Wrestrict option was causing "overlap" errors when
snprintf was copying one char[256] structure member to another
char[256] member in the same structure.
Using ast_alloca instead of declaring the structure inline
solves the issue.
Here's a link to the "enhancement":
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-10/msg00570.html
fax: Fix crashes in PJSIP re-negotiation scenarios.
This change fixes a few re-negotiation issues
uncovered with fax.
1. The fax support uses its own mechanism for
re-negotiation by conveying T.38 information in
its own frames. The new support for re-negotiating
when adding/removing/changing streams was also
being triggered for this causing multiple re-INVITEs.
The new support will no longer trigger when
transitioning between fax.
2. In off-nominal re-negotiation cases it was
possible for some state information to be left
over and used by the next re-negotiation. This
is now cleared.
Alexander Traud [Sun, 12 Apr 2020 14:53:50 +0000 (16:53 +0200)]
BuildSystem: Search for Python/C API when possibly needed only.
The Python/C API is used only if the Test Framework was enabled in Asterisk
'make menuselect'. The Test Framework is available only if the Developer Mode
was enabled in Asterisk './configure --enable-dev-mode'. And that Python/C API
is used only if the PJProject was found and not disabled in Asterisk; the user
did not go for './configure --without-pjproject'.
Furthermore, because version 2 of that Python/C API is required (currently) and
because some platforms do not offer a generic version 2, the script searches
for 2.7 explicitly as well.
To avoid version mismatch between the Python/C API and the Python environment,
the script searches for the latter in the same versions, in the same the order
as well. Because this Python/C API is just for (some) Asterisk contributors,
the script also goes for the Python 3 environment as a last resort for all
other Asterisk users. This allows 'make full' even on minimal installations of
Ubuntu 18.04 LTS and newer.
Because the Python/C API is Asterisk contributor specific, the Python packages
are removed from the script './contrib/scripts/install_prereq' as this script
is intended for Asterisk users. Asterisk contributors have to install much more
packages in any case, like:
sudo apt install autoconf automake git git-review python2.7-dev
In practice it has been seen that some users come
close to our maximum ICE candidate count of 32.
In case people have gone over this increases the
count to 64, giving ample room.
res_rtp_asterisk: Resolve loop when receive buffer is flushed
When the receive buffer was flushed by a received packet while it
already contained a packet with the same sequence number, Asterisk
never left the while loop which tried to order the packets.
This change makes it so if the packet is in the receive buffer it
is retrieved and freed allowing the buffer to empty.
res_rtp_asterisk: Free payload when error on insertion to data buffer
When the ast_data_buffer_put rejects to add a packet, for example because
the buffer already contains a packet with the same sequence number, the
payload will never be freed, resulting in a memory leak.
The data buffer will now return an error if this situation occurs
allowing the caller to free the payload. The res_rtp_asterisk module
has also been updated to do this.
So in order to remain backwards compatible we need to detect this API
change, and adjust accordingly. The simplest is to notice that the
bfd_get_section_size and bfd_get_section_vma MACROs are no longer
defined, and define then onto the new API. The alternative is to litter
the code with a number of #ifdef #else #endif splatters right through
the code.
Kevin Harwell [Tue, 31 Mar 2020 17:52:44 +0000 (12:52 -0500)]
channel: write to a stream on multi-frame writes
If a frame handling routine returns a list of frames (vs. a single frame)
those frames are never passed to a tech's write_stream handler even if one is
available. For instance, if a codec translation occurred and that codec
returned multiple frames then those particular frames were always only sent
to the tech's "write" handler. If that tech (pjsip for example) was stream
capable then those frames were essentially ignored. Thus resulting in bad
audio.
This patch makes it so the "write_stream" handler is appropriately called
for all cases, and for all frames if available.
When an outgoing channel is created a list of formats may
optionally be provided which is used as a request that the
formats be used if possible. If an endpoint is not configured
for any of the formats we ignore this request and use what is
configured. This has the side effect of also including other
stream types (such as video) that were not present in the
requested formats.
This change makes it so that the intention of the request is
preserved - that is if only an audio format is requested then
even if there is no joint audio format between the request and
the configuration we will still only place an audio stream in
the outgoing call.
Kevin Harwell [Tue, 17 Mar 2020 20:54:25 +0000 (15:54 -0500)]
ast_coredumper: add Asterisk information dump
This patch makes it so ast_coredumper now outputs the following information to
a *-info.txt file when processing a core file:
asterisk version and "built by" string
BUILD_OPTS
system start, and last reloaded date/time
taskprocessor list
equivalent of "bridge show all"
equivalent of "core show channels verbose"
Also a slight modification was made when trying to obtain the pid(s) of a
running Asterisk. If it fails to retrieve any it now reports an error.
Joshua C. Colp [Thu, 19 Mar 2020 13:48:39 +0000 (10:48 -0300)]
res_pjsip_session: Don't restrict non-audio default streams to sendrecv.
The state of the default audio stream is used for hold/unhold so we
restrict it to sendrecv as the core does not handle when it changes as
a result of hold/unhold.
This restriction does not apply to other media types though so we now
only restrict it to audio. This allows the other default streams to
store their state at all values, and not just sendrecv and removed.
George Joseph [Wed, 4 Mar 2020 21:45:40 +0000 (14:45 -0700)]
CI: Create generic jenkinsfile
This is a generic jenkinsfile to build Asterisk and optionally
perform one or more of the following:
* Publish the API docs to the wiki
* Run the Unit tests
* Run Testsuite Tests
This job can be triggered manually from Jenkins or be triggered
automatically on a schedule based on a cron string.
Joshua C. Colp [Thu, 20 Feb 2020 17:33:42 +0000 (17:33 +0000)]
res_rtp_asterisk: Improve video performance in certain networks.
The receive buffer will now grow if we end up flushing the
receive queue after not receiving the expected packet in time.
This is done in hopes that if this is encountered again the
extra buffer size will allow more time to pass and any missing
packets to be received.
The send buffer will now grow if we are asked for packets and
can't find them. This is done in hopes that the packets are
from the past and have simply been expired. If so then in
the future with the extra buffer space the packets should be
available.
Sequence number cycling has been handled so that the
correct sequence number is calculated and used in
various places, including for sorting packets and
for determining if a packet is old or not.
NACK sending is now more aggressive. If a substantial number
of missing sequence numbers are added a NACK will be sent
immediately. Afterwards once the receive buffer reaches 25%
a single NACK is sent. If the buffer continues to grow and
reaches 50% or greater a NACK will be sent for each received
future packet to aggressively ask the remote endpoint to
retransmit.
Given a scenario where session refreshes occur close to
each other while another is finishing it was possible for
the session refreshes to occur out of order. It was
also possible for session refreshes to be delayed for
quite some time if a session refresh did not result in
a topology change.
For the out of order session refreshes the first session
refresh would be queued due to a transaction in progress.
This transaction would then finish. When finished a
separate task to process the delayed requests queue
would be queued for handling. A second refresh would
be requested internally before this delayed request
queued task was processed. As no transaction was in
progress this session refresh would be immediately
handled before the queued session refresh.
The code will now check if any delayed requests exist
before allowing a session refresh to immediately occur.
If any exist then the session refresh is queued.
For the delayed session refreshes if a session refresh
did not result in a topology change the attempt would
be immediately stopped and no other delayed requests would
be processed.
The code will now go through the entire delayed requests
queue until a delayed request results in a request
actually being sent.
Joshua C. Colp [Sun, 5 Jan 2020 00:11:20 +0000 (00:11 +0000)]
bridging: Add better support for adding/removing streams.
This change adds support to bridge_softmix to allow the addition
and removal of additional video source streams. When such a change
occurs each participant is renegotiated as needed to reflect the
update. If another video source is added then each participant
gets another source. If a video source is removed then it is
removed from each participant. This functionality allows you to
have both your webcam and screenshare providing video if you
desire, or even more streams. Mapping has been changed to use
the topology index on the source channel as a unique identifier
for outgoing participant streams, this will never change and
provides an easy way to establish the mapping.
The bridge_simple and bridge_native_rtp modules have also been
updated to renegotiate when the stream topology of a party changes
allowing the same behavior to occur as added to bridge_softmix.
If a screen share is added then the opposite party is renegotiated.
If that screen share is removed then the opposite party is
renegotiated again.
Some additional fixes are also included in here. Stream state is
now conveyed in SDP so sendonly/recvonly/inactive streams can
be requested. Removed streams now also remove previous state
from themselves so consumers don't get confused.
George Joseph [Thu, 13 Feb 2020 19:39:58 +0000 (12:39 -0700)]
res_pjsip_outbound_registration: Fix SRV failover on timeout
In order to retry outbound registrations for some situations, we
need access to the tdata from the original request. For instance,
for 401/407 responses we need it to properly construct the
subsequent request with the authentication. We also need it if
we're iterating over a DNS SRV response record set so we can skip
entries we've already tried.
We've been getting the tdata from the server response rdata and
transaction but that only works for the failures where there was
actually a response (4XX, 5XX, etc). For timeouts there's no
response and therefore no rdata or transaction from which to get
the tdata. When processing a single A/AAAA record for a server,
this wasn't an issue as we just retried that same server after the
retry timer expired. If we got an SRV record set for the server
though, without the state from the tdata, we just kept trying the
first entry in the set repeatedly instead of skipping to the next
one in the list.
* Added a "last_tdata" member to the client state structure to keep
track of the sent tdata.
* Updated registration_client_send() to save the tdata it used into
the client_state.
* Updated sip_outbound_registration_response_cb() to use the tdata
saved in client_state when we don't get a response from the
server. We still use the tdata from the transaction when we DO
get a response from the server so we can properly handle 4XX
responses where our new request depends on it.
General note on timeouts:
Although res_pjsip_outbound_registration skips to the next record
immediately when a timeout occurs during SRV set traversal, it's
pjproject that determines how long to wait before a timeout is
declared. As with other SIP message types, pjproject will continue
trying the same server at an interval specified by "timer_t1" until
"timer_b" expires. Both of those timers are set in the pjsip.conf
"system" section.
Kevin Harwell [Thu, 13 Feb 2020 21:08:10 +0000 (15:08 -0600)]
res_rtp_asterisk: bad audio (static) due to incomplete dtls/srtp setup
There was a race condition between client initiated DTLS setup, and handling
of server side ice completion that caused the underlying SSL object to get
cleared during DTLS initialization. If this happened Asterisk would be left
in a partial DTLS setup state. RTP packets were sent and received, but were
not being encrypted and decrypted. This resulted in no audio, or static.
Specifically, this occurred when '__rtp_recvfrom' was processing the handshake
sequence from the client to the server, and then 'ast_rtp_on_ice_complete'
gets called from another thread and clears the SSL object when calling the
'dtls_perform_setup' function. The timing had to be just right in the sense
that from the external SSL library perspective SSL initialization completed
(rtp recv), Asterisk clears/resets the SSL object (ice done), and then checks
to see if SSL is intialized (rtp recv). Since it was cleared, Asterisk thinks
it is not finished, thus not completing 'dtls_srtp_setup'.
This patch removes calls to 'dtls_perform_setup', which clears the SSL object,
in 'ast_rtp_on_ice_complete'. When ice completes, there is no reason to clear
the underlying SSL object. If an ice candidate changes a full protocol level
renegotiation occurs. Also, in the case of bundled ICE candidates are reused
when a stream is added. So no real reason to have to clear, and reset in this
instance.
Also, this patch adds a bit of extra logging to aid in diagnosis of any future
problems.
George Joseph [Wed, 5 Feb 2020 16:37:17 +0000 (09:37 -0700)]
Asterisk Certified 16.8 Preparation
* Updated .gitreview default branch to certified/16.8
* Updated .version to certified/16.8
* Set all extended support modules to be disabled by default
Joshua C. Colp [Tue, 4 Feb 2020 14:18:13 +0000 (10:18 -0400)]
res_rtp_asterisk: Don't produce transport-cc if no packets.
The code assumed that when the transport-cc feedback
function was called at least one packet will have been
received. In practice this isn't always true, so now
we just reschedule the sending and do nothing.
George Joseph [Mon, 3 Feb 2020 16:24:58 +0000 (09:24 -0700)]
message.c: Add option to suppress the Message channel AMI and ARI events
In order to reduce the amount of AMI and ARI events generated,
the global "Message/ast_msg_queue" channel can be set to suppress
it's normal channel housekeeping events such as "Newexten",
"VarSet", etc. This can greatly reduce load on the manager
and ARI applications when the Digium Phone Module for Asterisk
is in use. To enable, set "hide_messaging_ami_events" in
asterisk.conf to "yes" In Asterisk versions <18, the default
is "no" preserving existing behavior. Beginning with
Asterisk 18, the option will default to "yes".
NOTE: This change does not affect UserEvents or the ARI
TextMessageReceived events.
* Added the "hide_messaging_ami_events" option to asterisk.conf.
* Changed message.c to set the AST_CHAN_TP_INTERNAL property on
the "Message/ast_msg_queue" channel if the option is set in
asterisk.conf. This suppresses the reporting of the events.
George Joseph [Mon, 3 Feb 2020 16:24:58 +0000 (09:24 -0700)]
message.c: Add option to suppress the Message channel AMI and ARI events
In order to reduce the amount of AMI and ARI events generated,
the global "Message/ast_msg_queue" channel can be set to suppress
it's normal channel housekeeping events such as "Newexten",
"VarSet", etc. This can greatly reduce load on the manager
and ARI applications when the Digium Phone Module for Asterisk
is in use. To enable, set "hide_messaging_ami_events" in
asterisk.conf to "yes" In Asterisk versions <18, the default
is "no" preserving existing behavior. Beginning with
Asterisk 18, the option will default to "yes".
NOTE: This change does not affect UserEvents or the ARI
TextMessageReceived events.
* Added the "hide_messaging_ami_events" option to asterisk.conf.
* Changed message.c to set the AST_CHAN_TP_INTERNAL property on
the "Message/ast_msg_queue" channel if the option is set in
asterisk.conf. This suppresses the reporting of the events.
Kevin Harwell [Mon, 27 Jan 2020 17:44:45 +0000 (11:44 -0600)]
res_stasis: trigger cleanup after update
The cleanup code in stasis shuts down applications if they are in a deactivated
state, and no longer have explicit subscriptions. When registering an app the
cleanup code was running before calling 'update'. When it should be executed
after 'update' since a call to register may re-activate the app. We don't want
it to shutdown before the 'update' otherwise the app won't be re-activated,
or registered.
This patch makes it so the cleanup code is executed post 'update'.
Kevin Harwell [Mon, 27 Jan 2020 18:01:15 +0000 (12:01 -0600)]
stasis/app: don't lock an app before a call to send
Calling 'app_send' eventually calls the app's message handler. It's possible
for a handler to obtain a lock on another object, and then need/want to lock
the app object. If the caller of 'app_send' locks the app object prior to
calling then there's a potential for a deadlock, if another thread calls
'app_send' without locking.
This patch makes it so 'app_send' is not called with the app object locked in
the section of code doing such.
Joshua C. Colp [Tue, 28 Jan 2020 15:18:45 +0000 (15:18 +0000)]
res_pjsip_pubsub: Increment persistence data ref when recreating.
Each subscription needs to have a reference to the persisted data
for it, as well as the main JSON contained within the tree. When
recreating a subscription this did not occur and they both shared
the same reference.
George Joseph [Wed, 22 Jan 2020 18:56:38 +0000 (11:56 -0700)]
cdr.c: Set event time on party b when leaving a parking bridge
When Alice calls Bob and Bob does a blind transfer to Charlie,
Bob's bridge leave event generates a finalize on both the party_a
and party_b CDRs but while the party_a CDR has the correct end time
set from the event time, party_b's leg did not. This caused that
CDR's end time to be equal to the answered time and resulted in a
billsec of 0.
* We now pass the bridge leave message event time to
cdr_object_party_b_left_bridge_cb() and set it on that CDR before
calling cdr_object_finalize() on it.
NOTE: This issue affected transfers using chan_sip most of the
time but also occasionally affected chan_pjsip probably due to
message timing.