Mark Michelson [Wed, 23 Dec 2015 21:07:05 +0000 (15:07 -0600)]
res_odbc: Remove connection management
Asterisk by default will create a single database connection and share
it among all threads that attempt to access the database. In previous
versions of Asterisk, this was tolerable, because the most used channel
driver, chan_sip, mostly accessed the database from a single thread.
With PJSIP, however, many threads may be attempting to perform database
operations, and there is the potential for many more database accesses,
meaning the concurrency is a horrible bottleneck if only one connection
is shared.
Asterisk has a connection pooling facility built into it, but the
implementation has flaws. For one, there is a strict limit on the number
of simultaneous connections that could be made to the database. Anything
beyond the maximum would result in a failed operation. Attempting to
predict what the maximum should be is nearly impossible even for someone
intimately familiar with Asterisk's threading model. In addition, use of
transactions in the dialplan can cause some severe bugs if connection
pooling is enabled.
This commit seeks to fix the concurrency problem by removing all
connection management code from Asterisk and leaving that to the
underlying unixODBC code instead. Now, Asterisk does not share a single
connection, nor does it try to maintain a connection pool. Instead, all
Asterisk ever does is request a connection from unixODBC and allow
unixODBC to either allocate those connections or retrieve them from a
pool.
Doing this has a bit of a ripple effect. For one, since connections are
not long-lived objects, several of the safeguards that previously
existed have been removed. We don't have to worry about trying to use a
connection that has gone stale. In every case, when we request a
connection, it has just been made and we don't need to perform any
sanity checks to be sure it's still active.
Another major player affected by this change is transactions.
Transactions and their respective connections were so tightly coupled
that it was almost pornographic. This code change moves
transaction-related code to its own file separate from the core ODBC
functionality. This way, the core of ODBC does not even have to know
that transactions exist.
In making this large change, I had to look at a lot of code and
understand it. When making this change, I discovered several places
where the behavior is definitely not ideal, but it seemed outside the
scope of this change to be fixing it. Instead, any place where I saw
some sort of room for improvement has had a XXX comment added explaining
what could be altered to improve it.
Richard Mudgett [Wed, 13 Jan 2016 22:49:22 +0000 (16:49 -0600)]
res_pjsip: Add CLI "pjsip dump endpt [details]"
Dump the res_pjsip endpt internals.
In non-developer mode we will not document or make easily accessible the
"details" option even though it is still available. The user has to know
it exists to use it. Presumably they would also be aware of the potential
crash warning below.
Warning: PJPROJECT documents that the function used by this CLI command
may cause a crash when asking for details because it tries to access all
active memory pools.
George Joseph [Tue, 19 Jan 2016 01:27:57 +0000 (18:27 -0700)]
res_pjproject: Add module providing pjproject logging and utils
res_pjsip_log_forwarder has been renamed to res_pjproject
and enhanced as follows:
As a follow-on to the recent 'Add CLI "pjsip show buildopts"' patch,
a new ast_pjproject_get_buildopt function has been added. It
allows the caller to get the value of one of the buildopts.
The initial use case is retrieving the runtime value of
PJ_MAX_HOSTNAME to insure we don't send a hostname greater
than pjproject can handle. Since it can differ between
the version of pjproject that Asterisk was compiled against
and the version of pjproject that Asterisk is running against,
we can't use the PJ_MAX_HOSTNAME macro directly in Asterisk
source code.
Matt Jordan [Mon, 18 Jan 2016 23:16:24 +0000 (17:16 -0600)]
funcs/func_cdr: Correctly report high precision values for duration and billsec
When CDRs were refactored, func_cdr's ability to report high precision values
for duration and billsec (the 'f' option) was broken. This was due to func_cdr
incorrectly interpreting the duration/billsec values provided by the CDR engine
in milliseconds, as opposed to seconds. Since the CDR engine only provides
duration and billsec in seconds, and does not expose either attribute with
sufficient precision to merely pass back the underlying value, this patch fixes
the bug by re-calculating duration and billsec with microsecond precision based
on the start/answer/end times on the CDR.
Joshua Colp [Tue, 19 Jan 2016 23:15:50 +0000 (19:15 -0400)]
test_threadpool: Wait for each task to complete and fix memory leak.
This change makes the thread_timeout_thrash unit test wait for
each task to complete. This fixes the problem where the test would
prematurely end when all threads were gone and a new one had to be
started to handle the last task. It also increases the thrasing as
it is now more likely for each task to encounter the above scenario.
This also fixes a memory leak where the data for each task was not
being freed.
Richard Mudgett [Tue, 19 Jan 2016 01:43:41 +0000 (19:43 -0600)]
taskprocessor.c: Fix some taskprocessor unrefs.
You have to call ast_taskprocessor_unref() outside of the taskprocessor
implementation code. Taskprocessor use since v12 has become more
transient than just the singleton uses in earlier versions.
Kevin Harwell [Thu, 14 Jan 2016 20:42:57 +0000 (14:42 -0600)]
bridge_basic: don't cache xferfailsound during an attended transfer
The xferfailsound was read from the channel at the beginning of the transfer,
and that value is "cached" for the duration of the transfer. Therefore, changing
the xferfailsound on the channel using the FEATURE() dialplan function does
nothing once the transfer is under way.
This makes it so the transfer code instead gets the xferfailsound configuration
options from the channel when it is actually going to be used.
This patch also fixes a potential memory leak of the props object as well as
making sure the condition variable gets initialized before being destroyed.
Kevin Harwell [Thu, 14 Jan 2016 22:00:50 +0000 (16:00 -0600)]
bridge_basic: don't play an attended transfer fail sound after target hangs up
If the attended transfer destination answers (picks call up or goes to
voicemail) and then hangs up on the transferer then transferer hears the
fail sound.
This patch makes it so the fail sound is not played when the transfer
destination/target hangs up after answering.
Corey Farrell [Thu, 14 Jan 2016 20:36:10 +0000 (15:36 -0500)]
Remove *.gcna / *.gcno files from added module sources.
Asterisk uses a Makefile macro to associate additional sources with a
module. This macro is responsible for creating clean targets but
previously left behind *.gcna and *.gcno files.
Daniel Journo [Sun, 10 Jan 2016 22:22:12 +0000 (22:22 +0000)]
pjsip: Add option global/regcontext
Added new global option (regcontext) to pjsip. When set, Asterisk will
dynamically create and destroy a NoOp priority 1 extension
for a given endpoint who registers or unregisters with us.
ASTERISK-25670 #close Reported-by: Daniel Journo
Change-Id: Ib1530c5b45340625805c057f8ff1fb240a43ea62
Joshua Colp [Tue, 12 Jan 2016 17:14:29 +0000 (13:14 -0400)]
app: Queue hangup if channel is hung up during sub or macro execution.
This issue was exposed when executing a connected line subroutine.
When connected or redirected subroutines or macros are executed it is
expected that the underlying applications and logic invoked are fast
and do not consume frames. In practice this constraint is not enforced
and if not adhered to will cause channels to continue when they shouldn't.
This is because each caller of the connected or redirected logic does not
check whether the channel has been hung up on return. As a result the
the hung up channel continues.
This change makes it so when the API to execute a subroutine or
macro is invoked the channel is checked to determine if it has hung up.
If it has then a hangup is queued again so the caller will see it
and stop.
Joshua Colp [Tue, 12 Jan 2016 19:25:48 +0000 (13:25 -0600)]
Merge topic 'update_taskprocessor_commands'
* changes:
Sorcery: Create human friendly serializer names.
Stasis: Create human friendly taskprocessor/serializer names.
taskprocessor.c: New API for human friendly taskprocessor names.
taskprocessor.c: Sort CLI "core show taskprocessors" output.
Mark Michelson [Tue, 12 Jan 2016 16:36:15 +0000 (10:36 -0600)]
res_sorcery_realtime: Remove leading ^ requirement.
res_sorcery_realtime's search-by-regex callback performed a check to
ensure that the passed-in regex began with a caret (^). If it did not,
then no results would be returned.
This callback only started to become used when "like" support was added
to PJSIP CLI commands. The CLI command for listing objects would pass an
empty regex ("") to the sorcery backend if no "like" statement was
present. For most sorcery backends, this resulted in returning all
objects. However, for realtime, this resulted in returning no objects.
This commit seeks to fix the regression by removing the requirement from
res_sorcery_realtime for the passed-in-regex to begin with a caret.
On a system with multiple ip addresses in the same subnet, if a
transport is bound to a specific ip address and endpoint/media_address
is set, the SIP/SDP will have the correct address in all fields but
the rtp stream MAY still originate from one of the other ip addresses,
most probably the "primary" ip address. This happens because
res_pjsip_sdp_rtp/create_rtp always calls ast_instance_new with
the "all" ip address (0.0.0.0 or ::).
The new option causes res_pjsip_sdp_rtp/create_rtp to call
ast_rtp_instance_new with the endpoint's media_address (if specified)
instead of the "all" address. This causes the packets to originate from
the specified address.
ASTERISK-25632
ASTERISK-25637 Reported-by: Olivier Krief Reported-by: Dan Journo
Change-Id: I3dfaa079e54ba7fb7c4fd1f5f7bd9509bbf8bd88
Kevin Harwell [Fri, 8 Jan 2016 22:59:44 +0000 (16:59 -0600)]
pbx: Deadlock between contexts container and context_merge locks
Recent changes (ASTERISK-25394 commit 2bd27d12223fe33b58c453965ed5c6ed3af7c4f5)
introduced the possibility of a deadlock. Due to the mentioned modifications
ast_change_hints now needs to keep both merge/delete and state callbacks from
occurring while it executes. Unfortunately, sometimes ast_change_hints can be
called with the contexts container locked. When this happens it's possible for
another thread to grab the context_merge_lock before the thread calling into
ast_change_hints does and then try to obtain the contexts container lock. This
of course causes a deadlock between the two threads. The thread calling into
ast_change_hints waits for the other thread to release context_merge_lock and
the other thread is waiting on that one to release the contexts container lock.
Unfortunately, there is not a great way to fix this problem. When hints change,
the subsequent state callbacks cannot run at the same time as a merge/delete,
nor when the usual state callbacks do. This patch alleviates the problem by
having those particular callbacks (the ones run after a hint change) occur in a
serialized task. By moving the context_merge_lock to a task it can now safely be
attempted or held without a deadlock occurring.
ASTERISK-25640 #close
Reported by: Krzysztof Trempala
This patch causes another problem and should not have been needed.
Before this patch, persistent_endpoint_contact_deleted_observer WAS
deleting the contact_status when ast_sip_location_delete_contact was
called. By deleting it yourself in ast_sip_location_delete_contact
it was gone before the observer could run and the observer therefore
was throwing an error and not sending stasis/AMI/statsd messages.
So, I don't think this was the cause of your original issue. I also
had verified the contact AMI and statsd lifecycle and it was working.
I'll double check now though.
ASTERISK-25675 Reported-by: Daniel Journo
Change-Id: Ib586a6b7f90acb641b0c410f659743ab90e84f1a
Richard Mudgett [Thu, 7 Jan 2016 01:00:27 +0000 (19:00 -0600)]
ccss.c: Replace space in taskprocessor name.
The CLI "core ping taskprocessor" command does not work very
well with taskprocessor names that have spaces in them. You
have to put quotes around the name so using tab completion
becomes awkward.
include/asterisk/time.h: Renamed global declaration:tv
Renamed global declaration:tv to dummy_tv_var_for_types,
which would oltherwise cause 'shadow' warnings when 'tv'
was declared as a local variable elsewhere.
Added comment to note that dummy_tv_var_for_types is never
really exported and only used as a place holder.
Mark Michelson [Thu, 7 Jan 2016 21:37:36 +0000 (15:37 -0600)]
PJSIP: Prevent deadlock due to dialog/transaction lock inversion.
A deadlock was observed where the monitor thread was stuck, therefore
resulting in no incoming SIP traffic being processed.
The problem occurred when two 200 OK responses arrived in response to a
terminating NOTIFY request sent from Asterisk. The first 200 OK was
dispatched to a threadpool worker, who locked the corresponding
transaction. The second 200 OK arrived, resulting in the monitor thread
locking the dialog. At this point, the two threads are at odds, because
the monitor thread attempts to lock the transaction, and the threadpool
thread loops attempting to try to lock the dialog.
In this case, the fix is to not have the monitor thread attempt to hold
both the dialog and transaction locks at the same time. Instead, we
release the dialog lock before attempting to lock the transaction.
There have also been some debug messages added to the process in an
attempt to make it more clear what is going on in the process.
Walter Doekes [Wed, 6 Jan 2016 13:12:40 +0000 (14:12 +0100)]
Add sipp-sendfax.xml and spandspflow2pcap.py to contrib/scripts.
The spandspflow2pcap.py creates pcap files from fax.log files, generated
through 'fax set debug on' when receiving a fax. An example fax.log is
included as spandspflow2pcap.log.
The sipp-sendfax.xml SIPp scenario can be used to replay that fax with a
recent version of SIPp.
Aaron An [Mon, 4 Jan 2016 10:26:55 +0000 (18:26 +0800)]
cel/cel_radius: Fix wrong pointer.
The macro ADD_VENDOR_CODE defined in the cel_radius.c should use the parameter
y not the address of y.
I capture the radius UDP packet via tcpdump, and the AV pairs are not correct,
then i review the source code and compare it with cdr/cdr_radius.c. Fix it and
it works.
ASTERISK-25647 #close
Reported by: Aaron An
Tested by: Aaron An
Martin Tomec [Mon, 21 Dec 2015 17:07:14 +0000 (18:07 +0100)]
app_queue: Add member flag "in_call" to prevent reading wrong lastcall time
Member lastcall time is updated later than member status. There was chance to
check wrapuptime for available member with wrong (old) lastcall time.
New boolean flag "in_call" is set to true right before connecting call, and
reset to false after update of lastcall time. Members with "in_call" set to true
are treat as unavailable.