chan_sip: Make autocreated peers send PeerStatus events
Since Stasis has been introduced, an attempt to send AMI messages by an
autocreated peer caused a crash, and all events from autocreated peers were
semi-inadvertently disabled altogether in 0b83761. This change restores the
disabled functionality.
George Joseph [Fri, 22 Apr 2016 22:53:23 +0000 (16:53 -0600)]
res_agi: Prevent run_agi from eating frames it shouldn't
The run_agi function is eating control frames when it shouldn't be. This is
causing issues when an AGI is run from CONNECTED_LINE_SEND_SUB in a blond
transfer.
Alice calls Bob. Bob attended transfers to Charlie but hangs up before Charlie
answers.
Alice gets the COLP UPDATE indicating Charlie but Charlie never gets an UPDATE
and is left thinking he's connected to Bob.
In this case, when CONNECTED_LINE_SEND_SUB runs on Alice's channel and it calls
an AGI, the extra eaten frames prevent CONNECTED_LINE_SEND_SUB from running on
Charlie's channel.
The fix was to accumulate deferrable frames in the "forever" loop instead of
dropping them, and re-queue them just before running the actual agi command
or exiting.
Mark Michelson [Fri, 22 Apr 2016 18:49:50 +0000 (13:49 -0500)]
func_odbc: Use one connection per DSN.
res_odbc was changed in Asterisk 13.8.0 to remove connection management,
opting instead to let unixodbc maintain open connections and return
those to Asterisk as requested.
This was a boon for realtime, since it meant that multiple threads could
potentially run parallel queries since they could each be using their
own database connections.
However, on the user-facing side, func_odbc, there were some inherent
behaviors being relied on that no longer hold true after the change.
One such reported behavior was that MySQL's LAST_INSERTED_ID() works
per-connection. This means that if Asterisk uses separate connections
for every database operation, whereas before it used one connection for
everything, we have broken expectations and functionality.
The fix provided in this patch is to make func_odbc use a single
database connection per DSN. This way, user-facing database usage will
have the same behavior as it did pre-13.8.0. However, realtime, which is
the real workhorse of database interaction, will continue to let
unixodbc manage connections.
Richard Mudgett [Fri, 15 Apr 2016 19:36:59 +0000 (14:36 -0500)]
res_stasis: Handle re-enter stasis bridge with swap channel.
We lose the fact that there is a swap channel if there is one. We
currently wind up rejoining the stasis bridge as a normal join after the
swap channel has already been kicked from the bridge.
This patch preserves the swap channel so the AMI/ARI events can note that
the channel joining the bridge is swapping with another channel. Another
benefit to swaqpping in one operation is if there are any channels that
get lonely (MOH, bridge playback, and bridge record channels). The lonely
channels won't leave before the joining channel has a chance to come back
in under stasis if the swap channel is the only reason the lonely channels
are staying in the bridge.
ASTERISK-25947 #close
Reported by: Richard Mudgett
Richard Mudgett [Tue, 19 Apr 2016 21:58:32 +0000 (16:58 -0500)]
bridge: Hold off more than one imparting channel at a time.
An earlier patch blocked the ast_bridge_impart() call until the channel
either entered the target bridge or it failed. Unfortuantely, if the
target bridge is stasis and the imprted channel is not a stasis channel,
stasis bounces the channel out of the bridge to come back into the bridge
as a proper stasis channel. When the channel is bounced out, that
released the block on ast_bridge_impart() to continue. If the impart was
a result of a transfer, then it became a race to see if the swap channel
would get hung up before the imparted channel could come back into the
stasis bridge. If the imparted channel won then everything is fine. If
the swap channel gets hung up first then the transfer will fail because
the swap channel is leaving the bridge.
* Allow a chain of ast_bridge_impart()'s to happen before any are
unblocked to prevent the race condition described above. When the channel
finally joins the bridge or completely fails to join the bridge then the
ast_bridge_impart() instances are unblocked.
George Joseph [Tue, 19 Apr 2016 22:52:15 +0000 (16:52 -0600)]
res_pjsip_callerid: Clear out display name if id->name is not valid
When create_new_id_hdr creates a new RPID or PAI header, it starts by cloning
the From header, then it overwrites the display name and uri from the channel's
connected.id. If the connected.id.name wasn't valid, create_new_id_hdr was
leaving the display name from the From header in the new RPID or PAI header.
On an attended transfer where the originator had a caller id number set but not
a display name, the re-INVITE to the final transferee had the number of the
originator but the display name of the transferer.
Added a check to clear out the display name in the new header if
connected.id.name was invalid.
Mark Michelson [Mon, 18 Apr 2016 17:12:37 +0000 (12:12 -0500)]
PJSIP: Remove PJSIP parsing functions from uri length validation.
The PJSIP parsing functions provide a nice concise way to check the
length of a hostname in a SIP URI. The problem is that in order to use
those parsing functions, it's required to use them from a thread that
has registered with PJLib.
On startup, when parsing AOR configuration, the permanent URI handler
may not be run from a PJLib-registered thread. Specifically, this could
happen when Asterisk was started in daemon mode rather than
console-mode. If PJProject were compiled with assertions enabled, then
this would cause Asterisk to crash on startup.
The solution presented here is to do our own parsing of the contact URI
in order to ensure that the hostname in the URI is not too long. The
parsing does not attempt to perform a full SIP URI parse/validation,
since the hostname in the URI is what is important.
Mark Michelson [Mon, 18 Apr 2016 22:00:42 +0000 (17:00 -0500)]
res_pjsip_registrar: Fix bad memory-ness with user_agent.
Recent changes to the PJSIP registrar resulted in tests failing due to
missing AOR_CONTACT_ADDED test events. The reason for this was that the
user_agent string had junk values in it, resulting in being unable to
generate the event.
I'm going to be honest here, I have no idea why this was happening. Here
are the steps needed for the user_agent variable to get messed up:
* REGISTER is received
* First contact in the REGISTER results in a contact being removed
* Second contact in the REGISTER results in a contact being added
* The contact, AOR, expiration, and user agent all have to be passed as
format parameters to the creation of a string. Any subset of those
parameters would not be enough to cause the problem.
Looking into what was happening, the thing that struck me as odd was
that the user_agent variable was meant to be set to the value of the
User-Agent SIP header in the incoming REGISTER. However, when removing a
contact, the user_agent variable would be set (via ast_strdupa inside a
loop) to the stored contact's user_agent. This means that the
user_agent's value would be incorrect when attempting to process further
contacts in the incoming REGISTER.
The fix here is to use a different variable for the stored user agent
when removing a contact. Correcting the behavior to be correct also
means the memory usage is less weird, and the issue no longer occurs.
res_pjsip_transport_management: Allow unload to occur.
At shutdown it is possible for modules to be unloaded that wouldn't
normally be unloaded. This allows the environment to be cleaned up.
The res_pjsip_transport_management module did not have the unload
logic in it to clean itself up causing the res_pjsip module to not
get unloaded. As a result the res_pjsip monitor thread kept going
processing traffic and timers when it shouldn't.
Richard Mudgett [Fri, 15 Apr 2016 16:41:49 +0000 (11:41 -0500)]
bridge_channel.c: Ignore role setup failure in channel push.
We have to setup the channel roles after the bridge class push is called
because the bridge class push callback may have set roles on the incoming
channel. Since we have already partially pushed the channel into the
bridge and reversing what we have already done could be problematic, the
only thing we can do is press on to complete pushing the channel into the
bridge.
* Ignore any channel role setup errors after pushing the channel into a
bridge. The channel may behave incorrectly in the bridge but we can no
longer abort the push at this time.
Mark Michelson [Thu, 14 Apr 2016 18:49:35 +0000 (13:49 -0500)]
transport management: Register thread with PJProject.
The scheduler thread that kills idle TCP connections was not registering
with PJProject properly and causing assertions if PJProject was built in
debug mode.
This change registers the thread with PJProject the first time that the
scheduler callback executes.
"Idle" here means that someone connects to us and does not send a SIP
request. PJProject will not automatically time out such connections, so
it's up to Asterisk to do it instead.
When we receive an incoming TCP connection, we will start a timer
(equivalent to transaction timer D) waiting to receive an incoming
request. If we do not receive a request in that timeframe, then we will
shut down the TCP connection.
George Joseph [Tue, 12 Apr 2016 20:41:43 +0000 (14:41 -0600)]
pjproject: Add patch for removing strip of '[]' from header params
From the patch submitted to Teluu on 4/12/2016
<<<<<<<<<
The wholesale stripping of '[]' from header parameters causes issues if
something (like a port) occurs after the final ']'.
'[2001:a::b]' will correctly parse to '2001:a::b'
'[2001:a::b]:8080' will correctly parse to '2001:a::b' but the scanner is left
with ':8080' and parsing stops with a syntax error.
I can't even find a case where stripping the '[]' is a good thing anyway. Even
if you continued to parse and resulted in a string that looks like this...
'2001:a::b:8080', it's not valid.
This came up in Asterisk because Kamailio sends us a Contact with an alias
URI parameter that has an IPv6 address in it like this:
Contact: <sip:1171@127.0.0.1:5080;alias=[2001:1:2::3]~43691~6>
which should be legal but causes a syntax error because of the characters
after the final ']'. Even if it didn't, the '[]' should still not be stripped.
I've run the Asterisk Test Suite for PJSIP (252 tests) many of which are IPv6
enabled. No issues were caused by removing the code that strips the '[]'.
>>>>>>>>>>>
ASTERISK-25123 #close Reported-by: Anthony Messina
Change-Id: I5cb33f4ebf07ee1f2b26d07caae715e2ec65595a
The test_voicemail_notify_endl test checks the end-of-line
characters of an email message to confirm that they are consistent.
The test wrongfully assumed that reading from the email message
into a buffer will always result in more than 1 character being
read. This is incorrect. If only 1 character was read the test
would go outside of the buffer and access other memory causing
a crash.
The test now checks to ensure that 2 or more characters are read
in ensuring the test stays within the buffer.
George Joseph [Fri, 1 Apr 2016 18:30:56 +0000 (12:30 -0600)]
res_pjsip contact: Lock expiration/addition of contacts
Contact expiration can occur in several places: res_pjsip_registrar,
res_pjsip_registrar_expire, and automatically when anyone calls
ast_sip_location_retrieve_aor_contact. At the same time, res_pjsip_registrar
may also be attempting to renew or add a contact. Since none of this was locked
it was possible for one thread to be renewing a contact and another thread to
expire it immediately because it was working off of stale data. This was the
casue of intermittent registration/inbound/nominal/multiple_contacts test
failures.
Now, the new named lock functionality is used to lock the aor during contact
expire and add operations and res_pjsip_registrar_expire now checks the
expiration with the lock held before deleting the contact.
George Joseph [Sun, 10 Apr 2016 19:16:42 +0000 (13:16 -0600)]
pjproject: Add patch to fix Via IPv6 parsing
There's a bug in pjproject's sip_parser where the ":" wasn't correctly
interpreted. This is causing IPv6 addresses in the "received" parameter of the
Via header to cause a syntax check failure.
This patch was submitted to Teluu on 4/10/2016.
ASTERISK-25910 #close Reported-by: Anthony Messina
Change-Id: Ic7e4c4aa14ded61860401ec349f5177568c4d922
George Joseph [Fri, 1 Apr 2016 01:04:29 +0000 (19:04 -0600)]
lock: Add named lock capability
Locking some objects like sorcery objects can be tricky because the underlying
ao2 object may not be the same for all callers. For instance, two threads that
call ast_sorcery_retrieve_by_id on the same aor name might actually get 2
different ao2 objects if the underlying wizard had to rehydrate the aor from a
database. Locking one ao2 object doesn't have any effect on the other even if
those objects had locks in the first place.
Named locks allow access control by keyspace and key strings. Now an "aor"
named "1000" can be locked and any other thread attempting to lock "aor" "1000"
will wait regardless of whether the underlying ao2 object is the same or not.
Mutex and rwlocks are supported.
This capability will initially be used to lock an aor when multiple threads may
be attempting to prune expired contacts from it.
res_pjsip_outbound_publish: Add transport for outbound PUBLISH
The first available transport of the appropriate type is used now.
This patch adds new config option 'transport' for outbound-publish.
If transport is set then outbound PUBLISH requests will use this transport.
app_voicemail/IMAP: IMAP access FATAL error: Out of memory
Sometimes uw-imap function 'mail_fetchbody' returns huge len
which then pass to uw-imap function 'rfc822_base64'.
uw-imap tries to allocate huge memory and abort() on fail.
This patch check the len.
If the len more than max size (128 Mbytes) log error.
This patch also set variables len, newlen to avoid uninizialezed len.
This patch also check pointer returned by rfc822_base64.
Because SQLite doesn't support full ALTER capabilities, alembic scripts
require batch operations. However, that capability wasn't available until
0.7.0 which some distributions haven't reached yet. Therefore, the batch
operations introduced in commit 86d6e44cc (review 2319) have been reverted
and SQLite is unsupported again, for now anyway.
Tested the full upgrade and downgrade on MySQL/Mariadb and Postgresql.
res_pjsip_registrar_expire: Fix race condition at shutdown.
When shutting down, the PJSIP sorcery is destroyed. The registrar
expiration module queries the PJSIP sorcery to determine what
to expire. As there was no synchronization between termination
of the expiration thread and the unloading of the module it was
possible for the thread to try to access the PJSIP sorcery after
it had been destroyed.
This change ensures that the thread is shut down before allowing
the module to be considered unloaded.
Jacek Konieczny [Wed, 6 Apr 2016 13:01:47 +0000 (15:01 +0200)]
frame.c: Copy the whole subclass in ast_frdup().
The problem is ast_frdup() does not copy whole frame.subclass for voice,
video and image frames, only the format is copied. For video frames, the
subclass structure contains the .frame_ending flag used to put the RTP
marker where it needs to be.
Some SIP devices indicate hold/unhold using deferred SDP reinvites. In
other words, they provide no SDP in the reinvite.
A typical transaction that starts hold might look something like this:
* Device sends reinvite with no SDP
* Asterisk sends 200 OK with SDP indicating sendrecv on streams.
* Device sends ACK with SDP indicating sendonly on streams.
At this point, PJMedia's SDP negotiator saves Asterisk's local state as
being recvonly.
Now, when the device attempts to unhold, it again uses a deferred SDP
reinvite, so we end up doing the following:
* Device sends reinvite with no SDP
* Asterisk sends 200 OK with SDP indicating recvonly on streams
* Device sends ACK with SDP indicating sendonly on streams
The problem here is that Asterisk offered recvonly, and by RFC 3264's
rules, if an offer is recvonly, the answer has to be sendonly. The
result is that the device is not taken off hold.
What is supposed to happen is that Asterisk should indicate sendrecv in
the 200 OK that it sends. This way, the device has the freedom to
indicate sendrecv if it wants the stream taken off hold, or it can
continue to respond with sendonly if the purpose of the reinvite was
something else (like a session timer refresher).
The fix here is to alter the SDP negotiator's state when we receive a
reinvite with no SDP. If the negotiator's state is currently in the
recvonly or inactive state, then we alter our local state to be
sendrecv. This way, we allow the device to indicate the stream state as
desired.
ASTERISK-25854 #close
Reported by Robert McGilvray
George Joseph [Mon, 28 Mar 2016 04:33:29 +0000 (22:33 -0600)]
config: Allow filters when appending to a category
In sorcery based config files where there are multiple categories with the same
name, you can't use the (+) operator to reliably append to a category because
config.c stops looking when it finds the first one with the same name.
Example:
[1000]
type = endpoint
[1000]
type = aor
[1000](+)
authenticate_qualify = yes
This config will fail because config.c appends authenticate_qualify to the
first category it finds, the endpoint, and that's not valid for endpoint.
Solution:
The capability to find a category that contains a certain variable already
exists so the only real change was to parse anything after the '+' that's not a
comma, as a filter string.
[1000]
type = endpoint
[1000]
type = aor
[1000](+type=aor)
authenticate_qualify = yes
This now works as expected.
Although the following example doesn't make any sense for pjsip, you can even
specify multiple filters:
[1000](+type=aor&qualify_frequency=10)
ASTERISK-25868 #close Reported-by: Nick Repin
Change-Id: I10773da4c79db36fbf1993961992af63d3441580
George Joseph [Sat, 26 Mar 2016 04:22:34 +0000 (22:22 -0600)]
stringfields: Refactor to allow fields to be added to the end of structures
String fields are great, except that you can't add new ones without breaking
ABI compatibility because it shifts down everything else in the structure.
The only alternative is to add your own char * field to the end of the
structure and manage the memory yourself which isn't ideal, especially since
you then can't use the OPT_STRINGFIELD_T type.
Background:
The reason string fields had to be declared inside the
AST_DECLARE_STRING_FIELDS block was to facilitate iteration over all declared
fields for initialization, compare and copy. Since AST_DECLARE_STRING_FIELDS
declared the pool, then the fields, then the manager, you could use the offsets
of the pool and manager and iterate over the sequential addresses in between to
access the fields. The actual pool, field allocation and field set operations
don't actually care where the field is. It's just iteration over the fields
that was the problem.
Solution: Extended String Fields
An extended string field is one that is declared outside the
AST_DECLARE_STRING_FIELDS block but still (anywhere) inside the parent
structure. Other than using AST_STRING_FIELD_EXTENDED instead of
AST_STRING_FIELD, it looks the same as other string fields. It's storage comes
from the pool and it participates in string field compare and copy operations
peformed on the parent structure. It's also a valid target for the
OPT_STRINGFIELD_T aco option type.
Implementation:
To keep track of the extended fields and make sure that ABI isn't broken, the
existing embedded_pool pointer in the manager structure was repurposed to be a
pointer to a separate header structure that contains the embedded_pool pointer
plus a vector of fields. The length of the manager structure didn't change and
the embedded_pool pointer isn't used in the macros, only the stringfields C
code. A side benefit of this is that changing the header structure in the
future won't break ABI.
ast_string_fields_init initializes the normal string fields and appends them to
the vector, and subsequent calls to ast_string_field_init_extended initialize
and append the extended fields. Cleanup, ast_string_fields_cmp, and
ast_string_fields_copy can now work on the vector instead of sequentially
traversing the addresses between the pool and manager.
The total size of a structure using string fields didn't change, whether using
extended fields or not, nor have the offsets of any structure members, either
inside the original block or outside. Adding an extended field to the end of a
structure is the same as adding a char *.
Details:
The stringfield C code was pulled out from utils.c and into stringfields.c.
It just made sense.
Additional work was done in ast_string_field_init and
ast_calloc_with_stringfields to handle the allocation of the new header
structure and the vector, and the associated cleanup. In the process some
additional NULL pointer checking was added.
A lot of work was done in stringfields.h since the logic for compare and copy
is there. Documentation was added as well as somne additional NULL checking.
The ability to call ast_calloc_with_stringfields with a number of structures
greater than 1 never really worked. Well, the calloc worked but there was no
way to access the additional structures or clean them up. It was agreed that
there was no use case for requesting more than 1 structure so an ast_assert
was added to prevent it and the iteration code removed.
Testing:
The stringfield unit tests were updated to test both normal and extended
fields. Tests for ast_string_field_ptr_set_by_fields and
ast_calloc_with_stringfields were also added.
As an ABI test, 13 was compiled from git and the res_pjsip_* modules, except
res_pjsip itself, saved off. The patch was then added and a full compile and
install was performed. Then the older res_pjsip_* moduled were copied over the
installed versions so res_pjsip was new and the rest were old. No issues.
contact->aor, which is a char * at the end of contact, was then changed to an
extended string field and a recompile and reinstall was performed, again
leaving stock versions of the the res_pjsip_* modules. Again, no issues with
the res_pjsip_* modules using the old stringfield implementation and with
contact->aor as a char *, and res_pjsip itself using the new stringfield
implementation and contact->aor being an extended string field.
Finally, several existing string fields were converted to extended string
fields to test OPT_STRINGFIELD_T. Again, no issues.
check_installed_debs wasn't handling virtual packages like libsrtp-dev and
libresample-dev and on multiarch systems it was accidentally filtering out all
packages if any :i386 packages were found instead of just filtering out the
:i386 packages themselves.
George Joseph [Wed, 30 Mar 2016 23:34:42 +0000 (17:34 -0600)]
pjproject_bundled: Fix use of LDCONFIG for shared library link creation
LDCONFIG apparently isn't set to something sane on all systems so the creation
of the shared library links fails. Instead of just testing for non-blank,
main/Makefile now checks that LDCONFIG is actually executable and reverts to
LN if it isn't.
This applies to both libasteriskpj and libasteriskssl.
Thanks to 'abelbeck' for pointing out that the issue was LDCONFIG.
ASTERISK-25873 #close Reported-by: Hans van Eijsden
Change-Id: I25b76379bc637726ec044b2c0e709b56b3701729
Richard Mudgett [Tue, 29 Mar 2016 23:06:24 +0000 (18:06 -0500)]
res_ari: Cannot get control also means channel is unavailable.
The only caller of ari_bridges_play_found() has this note:
If ari_bridges_play_found fails because the channel is unavailable for
playback, The channel will be removed from the playback list soon. We can
keep trying to get channels from the list until we either get one that
will work or else there isn't a channel for this bridge anymore, in which
case we'll revert to ari_bridges_play_new.
Richard Mudgett [Tue, 29 Mar 2016 18:47:08 +0000 (13:47 -0500)]
res_stasis: Add control ref to playback and recording structs.
The stasis_app_playback and stasis_app_recording structs need to have a
struct stasis_app_control ref. Other threads can get a reference to the
playback and recording structs from their respective global container.
These other threads can then use the control pointer they contain after
the control struct has gone.
* Add control ref to stasis_app_playback and stasis_app_recording structs.
With the refs added, the control command queue can now have a circular
control reference which will cause the control struct to never get
released if the control's command queue is not flushed when the channel
leaves the Stasis application. Also the command queue needs better
protection from adding commands if the control->is_done flag is set.
Richard Mudgett [Mon, 28 Mar 2016 23:10:40 +0000 (18:10 -0500)]
res_stasis: Fix crash on a hanging up channel.
* Give the struct stasis_app_control ao2 object a ref to the channel held
in the object. Now the channel will still be around if a thread needs to
post a stasis message instead of crash because the topic was destroyed.
* Moved stopping any lingering silence generator out of the struct
stasis_app_control destructor and made it a part of exiting the Stasis
application. Who knows which thread the destructor will be called under
so it cannot affect the channel's silence generator. Not only was the
channel unprotected when the silence generator was stopped, stasis may no
longer even control the channel.
George Joseph [Wed, 30 Mar 2016 17:38:47 +0000 (11:38 -0600)]
res_pjsip_mwi: Allow subscribe to vm access extension as an alias
Background:
If your extension is 1000 and the voicemail access extension is 1571 and you
dial 1571, usually a dialplan rule calls voicemailmain with your extension and
you are placed directly in your mailbox. Therefore most admins program the
voicemail (or other speed dial) button on their phones to the access extension.
Some phones (Snom at least) use whatever is programmed there to also subscribe
for MWI and so can't dial one number and subscribe to another. This works fine
in chan_sip because chan_sip completely ignores the user portion of the
SUBSCRIBE message request URI. If it can match the peer, is subscribes to the
peer's mailbox. The user could be set to anything or nothing and you'd still
get subscribed to your mailbox.
Issue:
chan_pjsip actually uses the user portion of the URI to find an aor and its
mailboxes. Therefore a subscribe to 1571 results in a 404. Sure, you can
create an aor for 1571 but you certainly can't add your entire voicemail
system's mailboxes to it and everyone would get notified of every MWI.
Solution:
When an MWI subscribe comes in and an aor can't be found that matches the
resource directly, check the resource against the endpoint's aors. If an aor
is found that has a voicemail_extension that matches the resource, use it.
ASTERISK-25865 Reported-by: Ross Beer
Change-Id: I770ea185f751f1ada888fafb4b452115f1c06e9e