Kevin Harwell [Tue, 3 May 2016 20:43:16 +0000 (15:43 -0500)]
res_pjsip_outbound_publish: state potential dropped on reloads/realtime fetches
When reloading, or fetching realtime data, if the "apply" failed for any
numerous reasons the current state object would not be maintained. This
potentially resulted in publishes being stopped for some states/clients when
they should not have been.
This patch makes it so the current state object is kept upon any type of reload/
fetch failures.
Kevin Harwell [Thu, 5 May 2016 16:37:37 +0000 (11:37 -0500)]
res_pjsip_authenticator_digest: Don't use source port in nonce verification
From the issue reporter:
"res_pjsip_outbound_authenticator_digest builds a nonce that is a hash of
the timestamp, the source address, the source port, a server UUID that is
calculated at startup, and the authentication realm.
Rather than caching nonces that we create, we instead attempt to re-calculate
the nonce when receiving an incoming request with authentication. We then
compare the re-calculated nonce to the incoming nonce, and if they don't match,
then authentication has failed early.
The problem is that it is possible, especially when using TCP, to receive two
requests from the same endpoint but have differing source ports for those
requests. Asterisk itself commonly will use different source ports for
outbound TCP requests."
This patch removes the source port dependency when building the nonce.
Jaco Kroon [Wed, 4 May 2016 07:40:55 +0000 (09:40 +0200)]
app_confbridge: Add a regcontext option for confbridge bridge profiles.
This patch allows for having app_confbridge register the name of the
conference as an extension into a specific context, similar to
regcontext for chan_sip. This variant is not quite as involved as the
one in chan_sip and doesn't allow for multiple contexts or custom
extensions, you can only specify the context and the conference name
will always be used as the extension to register.
George Joseph [Mon, 9 May 2016 01:19:50 +0000 (19:19 -0600)]
pjproject_bundled: Check for python-dev and TEST_FRAMEWORK
The pjsua and pjsystest apps are now built only if TEST_FRAMEWORK is set.
The python bindings are now built only if TEST_FRAMEWORK is set and a
python development package is installed.
The res_pjsip_authenticator_digest, res_pjsip_endpoint_identifier_*
and res_pjsip_registrar modules should load ASAP
to avoid "No matching endpoint found" for legitimate endpoint.
stasis_endpoints: Add new Status and Headers to ContactStatus
ASTERISK-25903 added a new headers to AMI Event ContactStatusDetail.
ASTERISK-25904 added a new Status to AMI Event ContactStatusDetail.
These additions should be also in stasis_endpoints
to include in command "manager show event ContactStatus"
Joshua Colp [Thu, 5 May 2016 10:07:50 +0000 (07:07 -0300)]
file: Ensure nativeformats remains valid for lifetime of use.
It is possible for the nativeformats of a channel to change
throughout its lifetime. As a result a user of it needs to either
ensure the channel is locked when accessing the formats or keep
a reference to the nativeformats themselves.
This change fixes the file playback support so it keeps a
reference to the nativeformats when accessing things.
With the old SIP module AMI sends PeerStatus event on every
successfully REGISTER requests, ie, on start registration,
update registration and stop registration.
With PJSIP AMI sends ContactStatus only when status is changed.
Regarding registration:
on start registration - Created
on stop registration - Removed
but on update registration nothing
res_fax/t38_gateway: Peer V.21 session is created on wrong channel
The channel and peer V.21 sessions are created on the same channel now.
The peer V.21 session should be created only on peer channel
when one of channel can handle T.38.
Also this patch enable debug for T.38 gateway session
if global fax debug enabled.
George Joseph [Sat, 30 Apr 2016 22:52:47 +0000 (16:52 -0600)]
pjproject_bundled: Various fixes discovered during testing of OSes
For all OSes:
* Disabled third-party codecs in pjproject and added
'--disable-speex-codec --disable-speex-aec --disable-gsm-codec' to the
configure options since we don't use the pjsip codec capability.
FreeBSD:
* Added FreeBSD support to install_prereq.
* Changed pjproject/configure.m4 to use $GNU_MAKE instead of hardcoding "make".
* Added __progname and environ to asterisk.exports.in.
* Reverted the use of ldconfig to create shared library symlinks to ln.
* Only enable epoll in pjproject if `uname -s` is Linux.
* Added a patch to pjproject to take the name of the 'make' command from
an environment variable if supplied. This is needed for the python bindings.
(merged by Teluu into pjproject trunk 5/3/2016)
FreeBSD support isn't complete. Still some general issues regarding
make/gmake having nothing to do with pjproject. With some handholding it DOES
build successfully.
CentOS:
Added 'patch' and 'bzip2' to install_prereq PACKAGES_RH.
CentOS 6/7 32/64 build and run the pjsip testsuite successfully.
Ubuntu:
No changes required.
Ubuntu 15/16 32/64 build and run the pjsip testsuite successfully.
Debian:
No changes required.
Debian 6/7/8 32/64 build and run the pjsip testsuite successfully.
There will utimately be a follow-up patch to create an install_prereq for
the testsuite as I've discovered a few missing requirements.
Andrew Nagy [Thu, 17 Mar 2016 19:29:38 +0000 (12:29 -0700)]
app_voicemail: always copy dynamic struct to avoid race condition
Voicemail email addresses can be corrupt or voicemail
emails can end up being sent to the wrong email address if asterisk is
reading voicemail.conf during a reload and processing an email at the
same time. This patch always copies the struct that would otherwise only
be copied once.
ASTERISK-24463 #close
Reported by: John Campbell
Tested by: Etienne Lessard
Tested by: Andrew Nagy
Change-Id: I3a0643813116da84e2617291903d0d489b7425fb
If the Asterisk system name is set in asterisk.conf, it will be stored
into the "reg_server" field in the ps_contacts table to facilitate
multi-server setups.
When pjsip_parse_uri is called with PJSIP_UNESCAPE_IN_PLACE enabled,
the input uri string will become corrupted if it contains escape sequences.
It's not possible to automatically strdup or strdupa the input string because
the output uri pj_str_t's will have pointers to chunks of the input string.
Getting around this would require more memory management code and wouldn't
be worth the savings of doing the unescape in place.
George Joseph [Tue, 8 Mar 2016 00:34:31 +0000 (17:34 -0700)]
res_pjsip: Add ability to identify by Authorization username
A feature of chan_sip that service providers relied upon was the ability to
identify by the Authorization username. This is most often used when customers
have a PBX that needs to register rather than identify by IP address. From my
own experiance, this is pretty common with small businesses who otherwise
don't need a static IP.
In this scenario, a register from the customer's PBX may succeed because From
will usually contain the PBXs account id but an INVITE will contain the caller
id. With nothing recognizable in From, the service provider's Asterisk can
never match to an endpoint and the INVITE just stays unauthorized.
The fixes:
A new value "auth_username" has been added to endpoint/identify_by that
will use the username and digest fields in the Authorization header
instead of username and domain in the the From header to match an endpoint,
or the To header to match an aor. This code as added to
res_pjsip_endpoint_identifier_user rather than creating a new module.
Although identify_by was always a comma-separated list, there was only
1 choice so order wasn't preserved. So to keep the order, a vector was added
to the end of ast_sip_endpoint. This is only used by res_pjsip_registrar
to find the aor. The res_pjsip_endpoint_identifier_* modules are called in
globals/endpoint_identifier_order.
Along the way, the logic in res_pjsip_registrar was corrected to match
most-specific to least-specific as res_pjsip_endpoint_identifier_user does.
The order is:
username@domain
username@domain_alias
username
Auth by username does present 1 problem however, the first INVITE won't have
an Authorization header so the distributor, not finding a match on anything,
sends a securty_alert. It still sends a 401 with a challenge so the next
INVITE will have the Authorization header and presumably succeed. As a result
though, that first security alert is actually a false alarm.
To address this, a new feature has been added to pjsip_distributor that keeps
track of unidentified requests and only sends the security alert if a
configurable number of unidentified requests come from the same IP in a
configurable amout of time. Those configuration options have been added to
the global config object. This feature is only used when auth_username
is enabled.
Finally, default_realm was added to the globals object to replace the hard
coded "asterisk" used when an endpoint is not yet identified.
The testsuite tests all pass but new tests are forthcoming for this new
feature.
ASTERISK-25835 #close Reported-by: Ross Beer
Change-Id: I30ba62d208e6f63439600916fcd1c08a365ed69d
Mark Michelson [Wed, 27 Apr 2016 18:23:37 +0000 (13:23 -0500)]
func_odbc: Check connection status before executing queries.
A recent change to func_odbc made it so that a single connection was
maintained per DSN. The problem was that the code was optimistic about
the health of the connection after initially opening it and did nothing
to re-connect in case the connection had died.
This change adds a check before executing a query to ensure that the
connection to the database is still up and running.
res_pjsip: disable multi domain to improve realtime performace
This patch added new global pjsip option 'disable_multi_domain'.
Disabling Multi Domain can improve Realtime performance by reducing
number of database requests.
chan_sip: Give more time for TCP/TLS threads to stop.
The unload process currently tells each TCP/TLS to terminate but
does not wait for them to do so. This introduces a race condition
where the container holding the threads may be destroyed before
the threads are able to remove themselves from it. When they
finally do the container is invalid and can't be used causing a
crash.
A previous change existed which waited a bit to wait for any
stranglers to finish. This change extends this and waits longer.
When unloading the app_queue module the members in each queue are
destroyed and as part of this they are removed from the pending
members container. Unfortunately a crash would occur as the container
was destroyed before the members were removed.
This change tweaks ordering so the container destruction occurs
after the members are destroyed.
Merge changes from topic 'system_stress_patches' into 13
* changes:
test_message.c: Wait longer in case dialplan also processes the test message.
Manager: Short circuit AMI message processing.
manager.c: Eliminate most RAII_VAR usage.
manager_channels.c: Fix allocation failure crash.
George Joseph [Mon, 25 Apr 2016 03:51:16 +0000 (21:51 -0600)]
config: Fix ast_config_text_file_save2 writability check for missing files
A patch I did back in 2014 modified ast_config_text_file_save2 to check the
writability of the main file and include files before truncating and re-writing
them. An unintended side-effect of this was that if a file doesn't exist,
the check fails and the write is aborted.
This patch causes ast_config_text_file_save2 to check the writability of the
parent directory of missing files instead of checking the file itself. This
allows missing files to be created again. A unit test was also added to
test_config to test saving of config files.
The regression was discovered when app_voicemail's passwordlocation=spooldir
feature stopped working.
ASTERISK-25917 #close Reported-by: Jonathan Rose
Change-Id: Ic4dbe58c277a47b674679e49daed5fc6de349f80
chan_sip: Make autocreated peers send PeerStatus events
Since Stasis has been introduced, an attempt to send AMI messages by an
autocreated peer caused a crash, and all events from autocreated peers were
semi-inadvertently disabled altogether in 0b83761. This change restores the
disabled functionality.
Kevin Harwell [Thu, 21 Apr 2016 19:23:21 +0000 (14:23 -0500)]
app_queue: queue members can receive multiple calls
It was possible for a queue member that is a member of at least 2 or more
queues to receive mulitiple calls at the same time. This happened because
of a race between when a member was being rung and when the device state
notified the other queue(s) member object of the state change.
This patch makes it so when a queue member is being rung it gets added to
a global pool of queue members. If that same member is tried again, e.g.
from another queue, and it is found to already exist in the pending member
container then it will not ring that member.
George Joseph [Fri, 22 Apr 2016 22:53:23 +0000 (16:53 -0600)]
res_agi: Prevent run_agi from eating frames it shouldn't
The run_agi function is eating control frames when it shouldn't be. This is
causing issues when an AGI is run from CONNECTED_LINE_SEND_SUB in a blond
transfer.
Alice calls Bob. Bob attended transfers to Charlie but hangs up before Charlie
answers.
Alice gets the COLP UPDATE indicating Charlie but Charlie never gets an UPDATE
and is left thinking he's connected to Bob.
In this case, when CONNECTED_LINE_SEND_SUB runs on Alice's channel and it calls
an AGI, the extra eaten frames prevent CONNECTED_LINE_SEND_SUB from running on
Charlie's channel.
The fix was to accumulate deferrable frames in the "forever" loop instead of
dropping them, and re-queue them just before running the actual agi command
or exiting.
Richard Mudgett [Tue, 12 Apr 2016 20:29:52 +0000 (15:29 -0500)]
Manager: Short circuit AMI message processing.
Improve AMI message processing performance if there are no consumers
listening for the messages. We now skip creating the AMI event message
text strings.
Richard Mudgett [Wed, 13 Apr 2016 22:09:53 +0000 (17:09 -0500)]
manager_channels.c: Fix allocation failure crash.
An earlier allocation failure failed to create a channel snapshot for the
AMI HangupRequest/SoftHangupRequest event which resulted in a crash in
channel_hangup_request_cb(). Where the stasis message gets generated
cannot tell if the NULL snapshot returned was because of an allocation
failure or the channel was a dummy channel.
* Made channel_hangup_request_cb() check if the channel blob has a
snapshot and exit if it doesn't.
* Eliminated the RAII_VAR usage in channel_hangup_request_cb().
Richard Mudgett [Wed, 13 Apr 2016 18:20:23 +0000 (13:20 -0500)]
bridge_softmix.c: Fix crash if channel fails to join mixing tech.
softmix_bridge_join() failed because of an allocation failure. To address
this, the softmix bridge technology now checks if the channel failed to
join softmix successfully. In addition, the bridge now begins the process
of kicking the channel out of the bridge so we don't have channels
partially in the bridge for very long.
* Fix the test_channel_feature_hooks.c unit tests. The test channel must
have a valid codec to join the simple_bridge technology. This patch makes
joining a bridge more strict by not allowing partially joined channels to
remain in the bridge.
Mark Michelson [Fri, 22 Apr 2016 18:49:50 +0000 (13:49 -0500)]
func_odbc: Use one connection per DSN.
res_odbc was changed in Asterisk 13.8.0 to remove connection management,
opting instead to let unixodbc maintain open connections and return
those to Asterisk as requested.
This was a boon for realtime, since it meant that multiple threads could
potentially run parallel queries since they could each be using their
own database connections.
However, on the user-facing side, func_odbc, there were some inherent
behaviors being relied on that no longer hold true after the change.
One such reported behavior was that MySQL's LAST_INSERTED_ID() works
per-connection. This means that if Asterisk uses separate connections
for every database operation, whereas before it used one connection for
everything, we have broken expectations and functionality.
The fix provided in this patch is to make func_odbc use a single
database connection per DSN. This way, user-facing database usage will
have the same behavior as it did pre-13.8.0. However, realtime, which is
the real workhorse of database interaction, will continue to let
unixodbc manage connections.
Richard Mudgett [Fri, 15 Apr 2016 19:36:59 +0000 (14:36 -0500)]
res_stasis: Handle re-enter stasis bridge with swap channel.
We lose the fact that there is a swap channel if there is one. We
currently wind up rejoining the stasis bridge as a normal join after the
swap channel has already been kicked from the bridge.
This patch preserves the swap channel so the AMI/ARI events can note that
the channel joining the bridge is swapping with another channel. Another
benefit to swaqpping in one operation is if there are any channels that
get lonely (MOH, bridge playback, and bridge record channels). The lonely
channels won't leave before the joining channel has a chance to come back
in under stasis if the swap channel is the only reason the lonely channels
are staying in the bridge.
ASTERISK-25947 #close
Reported by: Richard Mudgett
Richard Mudgett [Tue, 19 Apr 2016 21:58:32 +0000 (16:58 -0500)]
bridge: Hold off more than one imparting channel at a time.
An earlier patch blocked the ast_bridge_impart() call until the channel
either entered the target bridge or it failed. Unfortuantely, if the
target bridge is stasis and the imprted channel is not a stasis channel,
stasis bounces the channel out of the bridge to come back into the bridge
as a proper stasis channel. When the channel is bounced out, that
released the block on ast_bridge_impart() to continue. If the impart was
a result of a transfer, then it became a race to see if the swap channel
would get hung up before the imparted channel could come back into the
stasis bridge. If the imparted channel won then everything is fine. If
the swap channel gets hung up first then the transfer will fail because
the swap channel is leaving the bridge.
* Allow a chain of ast_bridge_impart()'s to happen before any are
unblocked to prevent the race condition described above. When the channel
finally joins the bridge or completely fails to join the bridge then the
ast_bridge_impart() instances are unblocked.
George Joseph [Tue, 19 Apr 2016 22:52:15 +0000 (16:52 -0600)]
res_pjsip_callerid: Clear out display name if id->name is not valid
When create_new_id_hdr creates a new RPID or PAI header, it starts by cloning
the From header, then it overwrites the display name and uri from the channel's
connected.id. If the connected.id.name wasn't valid, create_new_id_hdr was
leaving the display name from the From header in the new RPID or PAI header.
On an attended transfer where the originator had a caller id number set but not
a display name, the re-INVITE to the final transferee had the number of the
originator but the display name of the transferer.
Added a check to clear out the display name in the new header if
connected.id.name was invalid.
Mark Michelson [Mon, 18 Apr 2016 17:12:37 +0000 (12:12 -0500)]
PJSIP: Remove PJSIP parsing functions from uri length validation.
The PJSIP parsing functions provide a nice concise way to check the
length of a hostname in a SIP URI. The problem is that in order to use
those parsing functions, it's required to use them from a thread that
has registered with PJLib.
On startup, when parsing AOR configuration, the permanent URI handler
may not be run from a PJLib-registered thread. Specifically, this could
happen when Asterisk was started in daemon mode rather than
console-mode. If PJProject were compiled with assertions enabled, then
this would cause Asterisk to crash on startup.
The solution presented here is to do our own parsing of the contact URI
in order to ensure that the hostname in the URI is not too long. The
parsing does not attempt to perform a full SIP URI parse/validation,
since the hostname in the URI is what is important.
Mark Michelson [Mon, 18 Apr 2016 22:00:42 +0000 (17:00 -0500)]
res_pjsip_registrar: Fix bad memory-ness with user_agent.
Recent changes to the PJSIP registrar resulted in tests failing due to
missing AOR_CONTACT_ADDED test events. The reason for this was that the
user_agent string had junk values in it, resulting in being unable to
generate the event.
I'm going to be honest here, I have no idea why this was happening. Here
are the steps needed for the user_agent variable to get messed up:
* REGISTER is received
* First contact in the REGISTER results in a contact being removed
* Second contact in the REGISTER results in a contact being added
* The contact, AOR, expiration, and user agent all have to be passed as
format parameters to the creation of a string. Any subset of those
parameters would not be enough to cause the problem.
Looking into what was happening, the thing that struck me as odd was
that the user_agent variable was meant to be set to the value of the
User-Agent SIP header in the incoming REGISTER. However, when removing a
contact, the user_agent variable would be set (via ast_strdupa inside a
loop) to the stored contact's user_agent. This means that the
user_agent's value would be incorrect when attempting to process further
contacts in the incoming REGISTER.
The fix here is to use a different variable for the stored user agent
when removing a contact. Correcting the behavior to be correct also
means the memory usage is less weird, and the issue no longer occurs.
res_pjsip_transport_management: Allow unload to occur.
At shutdown it is possible for modules to be unloaded that wouldn't
normally be unloaded. This allows the environment to be cleaned up.
The res_pjsip_transport_management module did not have the unload
logic in it to clean itself up causing the res_pjsip module to not
get unloaded. As a result the res_pjsip monitor thread kept going
processing traffic and timers when it shouldn't.
Richard Mudgett [Fri, 15 Apr 2016 16:41:49 +0000 (11:41 -0500)]
bridge_channel.c: Ignore role setup failure in channel push.
We have to setup the channel roles after the bridge class push is called
because the bridge class push callback may have set roles on the incoming
channel. Since we have already partially pushed the channel into the
bridge and reversing what we have already done could be problematic, the
only thing we can do is press on to complete pushing the channel into the
bridge.
* Ignore any channel role setup errors after pushing the channel into a
bridge. The channel may behave incorrectly in the bridge but we can no
longer abort the push at this time.
Mark Michelson [Thu, 14 Apr 2016 18:49:35 +0000 (13:49 -0500)]
transport management: Register thread with PJProject.
The scheduler thread that kills idle TCP connections was not registering
with PJProject properly and causing assertions if PJProject was built in
debug mode.
This change registers the thread with PJProject the first time that the
scheduler callback executes.
There are several places that do scheduled tasks or periodic housecleaning,
each with its own implementation:
* res_pjsip_keepalive has a thread that sends keepalives.
* pjsip_distributor has a thread that cleans up expired unidentified requests.
* res_pjsip_registrar_expire has a thread that cleans up expired contacts.
* res_pjsip_pubsub uses ast_sched directly and then calls ast_sip_push_task.
* res_pjsip_sdp_rtp also uses ast_sched to send keepalives.
There are also places where we should be doing scheduled work but aren't.
A good example are the places we have sorcery observers to start registration
or qualify. These don't work when changes are made to a backend database
without a pjsip reload. We need to check periodically.
As a first step to solving these issues, a new ast_sip_sched facility has
been created.
ast_sip_sched wraps ast_sched but only uses ast_sched as a scheduled queue.
When a task is ready to run, ast_sip_task_pusk is called for it. This ensures
that the task is executed in a PJLIB registered thread and doesn't hold up the
ast_sched thread so it can immediately continue processing the queue. The
serializer used by ast_sip_sched is one of your choosing or a random one from
the res_pjsip pool if you don't choose one.
Another feature is the ability to automatically clean up the task_data when the
task expires (if ever). If it's an ao2 object, it will be dereferenced, if
it's a malloc'd object it will be freed. This is selectable when the task is
scheduled. Even if you choose to not auto dereference an ao2 task data object,
the scheduler itself maintains a reference to it while the task is under it's
control. This prevents the data from disappearing out from under the task.
There are two scheduling models.
AST_SIP_SCHED_TASK_PERIODIC specifies that the invocations of the task occur at
the specific interval. That is, every "interval" milliseconds, regardless of
how long the task takes. If the task takes longer than the interval, it will
be scheduled at the next available multiple of interval. For exmaple: If the
task has an interval of 60 secs and the task takes 70 secs (it better not),
the next invocation will happen at 120 seconds.
AST_SIP_SCHED_TASK_DELAY specifies that the next invocation of the task should
start "interval" milliseconds after the current invocation has finished.
Also, the same ast_sched facility for fixed or variable intervals exists. The
task's return code in conjunction with the AST_SIP_SCHED_TASK_FIXED or
AST_SIP_SCHED_TASK_VARIABLE flags controls the next invocation start time.
One res_pjsip.h housekeeping change was made. The pjsip header files were
added to the top. There have been a few cases lately where I've needed
res_pjsip.h just for ast_sip calls and had compiles fail spectacularly because
I didn't add the pjsip header files to my source even though I never referenced
any pjsip calls.
Finally, a few new convenience APIs were added to astobj2 to make things a
little easier in the scheduler. ao2_ref_and_lock() calls ao2_ref() and
ao2_lock() in one go. ao2_unlock_and_unref() does the reverse. A few macros
were also copied from res_phoneprov because I got tired of having to duplicate
the same hash, sort and compare functions over and over again. The
AO2_STRING_FIELD_(HASH|SORT|CMP)_FN macros will insert functions suitable for
aor_container_alloc into your source.
This facility can be used immediately for the situations where we already have
a thread that wakes up periodically or do some scheduled work. For the
registration and qualify issues, additional sorcery and schema changes would
need to be made so that we can easily detect changed objects on a periodic
basis without having to pull the entire database back to check. I'm thinking
of a last-updated timestamp on the rows but more on this later.