Illustration by Anthony Russo - russoart.com

Frequently Asked Questions

Zfone's principal designer Phil Zimmermann answers a few questions on Zfone™ and ZRTP. Some of these answers are aimed at potential OEM customers in the VoIP industry who are familiar with VoIP jargon. Lots of Wikipedia links are provided to explain the jargon.

What is Zfone?
Can you show me a live demo?
Why do we need Zfone? For that matter, why do we even need secure VoIP at all?
Why is Zfone better?
Is there an IETF RFC for the Zfone protocol?
Is Zfone and ZRTP CALEA compliant?
What if I need to deploy a system that allows lawful interception within my organization?
Will the government attempt to stop VoIP encryption?
Can ZRTP use a PKI (public key infrastructure)?
Exactly how does Zfone and ZRTP protect against a man-in-the-middle (MiTM) attack?
ZRTP vs. other protocols:
- What about DTLS-SRTP? Why not use that?
- Why do we need ZRTP if we already have SRTP? Isn't SRTP good enough?
- My VoIP phone already uses SDES to negotiate keys. Is that safe?
- Why can't I just use IPsec to encrypt my VoIP calls?
I'm a VoIP developer. How can I get the ZRTP protocol implemented in my VoIP application?
What's the difference between ZRTP, Zfone, and the libZRTP SDK?
Is Zfone open source?
Is Zfone covered by any patents?
Do you support Elliptic Curve Diffie-Hellman?
Does Zfone or ZRTP work with ...? :
- Does Zfone work with Skype?
- Does Zfone work with plain old telephone service (POTS) phones?
- Can I use Zfone or ZRTP to call regular landline or mobile telephone numbers?
- My VoIP service provider (such as Vonage or AT&T) gave me an ATA (Analog Telephony Adapter), or VoIP router, that allows me to connect my old-fashioned telephone to my broadband connection. Will Zfone work with that?
- Does ZRTP work with the Asterisk PBX? What about IAX?
- Does ZRTP work with the FreeSWITCH PBX?
- Can ZRTP be used in conference calls?
- Can ZRTP be used with H.323 or other signaling protocols?
- Which platforms do you support?
Do both parties need Zfone (or products that use the Zfone protocol) to make a secure call?
How do ZRTP's key continuity features compare with SSH?
Isn't it a protocol layer violation to do the key management in the media instead of in the signaling?
Many VoIP clients include some form of built-in text chat or instant messaging. Does Zfone or ZRTP encrypt those text messages?
What about vulnerabilities?
- Does Zfone have any "back doors"?
- Has anyone done any real security analysis on Zfone or ZRTP?
- Will Zfone's protocol pass FIPS-140 validation?
- Is the Short Authentication String (SAS) vulnerable to an attacker with voice impersonation capabilities?
- Does Zfone and ZRTP encrypt Touch-Tone keypad DTMF tones?
- What if I use a Variable Bit Rate (VBR) codec? Won't that leak information?
Why do I have to register in order to download Zfone?
If I integrate your SDK into my VoIP application, will I have to worry about U.S. export controls?
Does Zfone protect against "social network analysis" and other forms of analysis based on traffic patterns?
Why is it called ZRTP?
Does Zfone or ZRTP slow down the VoIP call?
Why does Zfone show the IDLE status during some VoIP calls?
How does ZRTP verify the identity of who you call?

Q: What is Zfone?

A: Zfone™ is my new secure VoIP phone software which lets you make secure encrypted phone calls over the Internet. The ZRTP protocol used by Zfone will soon be integrated into many standalone secure VoIP clients, but today we have a software product that lets you turn your existing VoIP client into a secure phone. The current Zfone software runs in the Internet protocol stack on any Windows XP, Vista, Mac OS X, or Linux PC, and intercepts and filters all the VoIP packets as they go in and out of the machine, and secures the call on the fly. You can use a variety of different software VoIP clients to make a VoIP call. The Zfone software detects when the call starts, and initiates a cryptographic key agreement between the two parties, and then proceeds to encrypt and decrypt the voice packets. It has its own little separate GUI, telling the user if the call is secure. It's as if Zfone were a "bump on the cord", sitting between the VoIP client and the Internet. Think of it as a bump in the protocol stack.

Q: Can you show me a live demo?

A: Funny you should ask. Here is a presentation and demo I did at the DEFCON conference in August 2007.

Q: Why do we need Zfone? For that matter, why do we even need secure VoIP at all?

A: As VoIP grows into a replacement for the PSTN, we will absolutely need to protect it, or organized crime will be attacking it as intensively as they attack the rest of the Internet today. VoIP is far more vulnerable to interception than the PSTN. A PC on your office network can unknowingly host spyware that can intercept your corporate VoIP calls and store and organize them on a hard disk for convenient browsing by criminals half a world away, giving them trade secrets and insider trading opportunities. To see an example of an actual implementation of this kind of spyware, take a look at Peter Cox's SIPtap demo.

The Internet is not a safe medium to carry our phone calls. But Zfone solves these problems. This technology has social benefits. It has the power to change our lives, enabling us to have a private conversation any time we want with anyone, anywhere - without buying a plane ticket.

Q: Why is Zfone better?

A: The ZRTP protocol has some nice cryptographic features lacking in many other approaches to VoIP encryption. Although it uses a public key algorithm, it avoids the complexity of a public key infrastructure (PKI). In fact, it does not use persistent public keys at all. It uses ephemeral Diffie-Hellman with hash commitment, and allows the detection of man-in-the-middle (MiTM) attacks by displaying a short authentication string for the users to verbally compare over the phone. It has perfect forward secrecy, meaning the keys are destroyed at the end of the call, which precludes retroactively compromising the call by future disclosures of key material. But even if the users are too lazy to bother with short authentication strings, we still get fairly decent authentication against a MiTM attack, based on a form of key continuity. It does this by caching some key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. All this is done without reliance on a PKI, key certification, trust models, certificate authorities, or key management complexity that bedevils the email encryption world. It also does not rely on SIP signaling for the key management, and in fact does not rely on any servers at all. It performs its key agreements and key management in a purely peer-to-peer manner over the RTP packet stream. And it supports opportunistic encryption by auto-sensing if the other VoIP client supports ZRTP.

ZRTP doesn't need a PKI, and there are good reasons why it's a mistake to require a PKI for secure VoIP. Plus, there have been a growing number of spectacular security failures of traditional PKIs. Nonetheless, ZRTP can use a PKI if you already have one up and running. Follow this link for how this is done.

Q: Is there an IETF RFC for the Zfone protocol?

A: Yes, the IETF has published RFC 6189 that defines the ZRTP protocol used by Zfone to set up the cryptographic key agreement.

Also note that the XMPP Standards Foundation has a protocol extension for "Use of ZRTP in Jingle RTP Sessions", XEP-0262.

Q: Is Zfone and ZRTP CALEA compliant?

A: Zfone's architecture likely renders that question moot. The Communications Assistance for Law Enforcement Act applies in the US to the PSTN phone companies and VoIP service providers, such as Vonage. CALEA imposes requirements on VoIP service providers to give law enforcement access to whatever they have at the service provider, which would be only encrypted voice packets. ZRTP does all its key management in a peer-to-peer manner, so the service provider does not have access to any of the keys. Only Zfone's end users are involved in the key negotiation, and CALEA does not apply to end users.

Here is the operative language from CALEA itself:

and

Also, from the CALEA legislative history:

Finally, telecommunications carriers have no responsibility to decrypt encrypted communications that are the subject of court-ordered wiretaps, unless the carrier provided the encryption and can decrypt it. This obligation is consistent with the obligation to furnish all necessary assistance under 18 U.S.C. Section 2518(4). Nothing in this paragraph would prohibit a carrier from deploying an encryption service for which it does not retain the ability to decrypt communications for law enforcement access. [...] Nothing in the bill is intended to limit or otherwise prevent the use of any type of encryption within the United States. Nor does the Committee intend this bill to be in any way a precursor to any kind of ban or limitation on encryption technology. To the contrary, section 2602 protects the right to use encryption.

However, there is one usage scenario for the ZRTP protocol that may be subject to CALEA. Consider the case of a VoIP service provider that operates a PBX (such as Asterisk or FreeSWITCH) or conference call mixer for its customers. The service provider can implement the ZRTP protocol in the PBX, and this PBX can terminate the ZRTP media connections for both parties, acting as a trusted man-in-the-middle between two ZRTP-equipped end users, or act as a conference call mixer for several users. In this case, the PBX or conference bridge would be in a physical position to provide law enforcement access to either ZRTP or SRTP key material or actual plaintext media traffic, and thus be subject to CALEA. The usual end-to-end nature of ZRTP would be subverted. If your threat model includes this scenario, you may want to try to arrange direct end-to-end ZRTP connections whenever possible.

Q: What if I need to deploy a system that allows lawful interception within my organization?

A: If you must meet this requirement, there is a way to do it outside the scope of ZRTP, with the help of a PBX. A PBX may be configured to handle media in one of three ways:

1. The PBX can operate purely as a SIP server, allowing the media streams to flow peer-to-peer directly between the clients, bypassing the PBX. This allows ZRTP to achieve end-to-end security, preventing interception, lawful or otherwise.
2. The PBX can operate as a media relay, passing the media streams through the PBX without terminating them or modifying them, similar to what a TURN server does. The media remains encrypted end-to-end, which prevents interception.
3. The PBX can operate as a trusted man-in-the-middle, terminating the media streams for both parties at the PBX. This also terminates the ZRTP encryption for both parties at the PBX. This is mathematically equivalent to a classic man-in-the-middle attack, but it's not really an attack if the clients trust the PBX and consent to this. ZRTP has a mechanism to allow a client to recognize a trusted PBX to act as a "friendly" man-in-the-middle. This allows conference mixing, transcoding, and lawful interception of plaintext media, all within the confines of the trusted PBX.

The third option, involving a trusted PBX, is the operative one. Lawful interceptions may be performed at the PBX, where the encrypted media stream has been terminated and the media has been converted to plaintext, outside the zone of ZRTP protection.

Q: Will the government attempt to stop VoIP encryption?

A: It's a fair question to ask in a post-9/11 world. Just how likely would it be for the government to restrict the end user's use of secure VoIP? The question of whether strong cryptography should be restricted by the government was debated all through the 1990s. This debate had the participation of the White House, the NSA, the FBI, the courts, the Congress, the computer industry, civilian academia, and the press. This debate fully took into account the question of terrorists using strong crypto, and in fact, that was one of the core issues of the debate. Nonetheless, society's collective decision (over the FBI's objections) was that on the whole, we would be better off with strong crypto, unencumbered with government back doors. The export controls were lifted and no domestic controls were imposed. This was a good decision, because we took the time and had such broad expert participation. The 9/11 attacks did not change the wisdom of that collective decision, and although civil liberties on the whole have eroded since then, we haven't lost our right to use strong crypto.

The law enforcement community will be understandably concerned about the effects encrypted VoIP will have on their ability to perform lawful intercepts. But what will be the overall effects on the criminal justice system if we fail to encrypt VoIP? Historically, law enforcement has benefited from a strong asymmetry in the feasibility of government or criminals wiretapping the PSTN. As we migrate to VoIP, that asymmetry collapses. VoIP interception is so easy, organized crime will be able to wiretap prosecutors and judges, revealing details of ongoing investigations, names of witnesses and informants, and conversations with their wives about what time to pick up their kids at school. The law enforcement community will come to recognize that VoIP encryption actually serves their vital interests.

In the early 1990s, the government tried to control the end user's use of crypto by introducing the Clipper chip. That didn't go over too well politically, and had to be abandoned. The government will find it difficult to try again to stop end users from encrypting their traffic, regardless of whether that traffic is email, e-commerce web transactions, or VoIP calls.

Further, the government would have to force everyone to abandon peer-to-peer communication protocols in favor of centralized, old Eastern-Bloc-style, panoptic ways of doing things. That's not the direction technology has been heading. Rather than a "war on terrorism", the government would have to conduct a war on technology.

Q: Can ZRTP use a PKI (public key infrastructure)?

A: Yes. The ZRTP protocol does have the optional capability to use a PKI if you already have a PKI up and running. But ZRTP does not actually require a PKI.

First let's review some good reasons why it's a mistake to make a secure VoIP protocol require a PKI. There are major problems and complexities with building, maintaining, and relying on PKI. That's why in the 1990s, a number of companies died trying to build and market PKI technology. See Ellison and Schneier's paper Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure and Ellison's paper Improvements on Conventional PKI Wisdom. In the email encryption world, a PKI architecture was the kiss of death for PEM and MOSS, both of which were swept aside by PGP. This also led to S/MIME never reaching critical mass, despite its advantage of being bundled in Microsoft's products. PGP became the dominant email encryption standard because it did not depend on a centrally managed PKI.

Plus, there have been a growing number of spectacular security failures of traditional PKIs, notably, the Comodo and DigiNotar debacles, and the stolen certificate-signing keys that enabled the Stuxnet worm attack.

Nonetheless, if you feel you must use a PKI and already have one, here's how ZRTP can make use of it.

The ZRTP spec (RFC 6189) describes how ZRTP can use a PKI-backed digital signature key to sign the short authentication string (SAS) in the ZRTP CONFIRM packet, to reduce reliance on users verbally comparing them during the call. Organizations that feel comfortable with PKIs can still get what they want. Thus, ZRTP offers all of the advantages of a protocol that can use a PKI, without actually becoming dependent on a PKI for security.

There is another way for a ZRTP implementation to benefit from a PKI, without becoming dependent on one. The IETF plans to someday add integrity protection to the delivery of SIP information, and that integrity protection will rely on a PKI. If this ever happens, ZRTP has protocol features that can leverage an integrity-protected SIP layer to provide integrity protection for ZRTP's Diffie-Hellman exchange in the media layer. This in turn confers protection against a man-in-the-middle (MiTM) attack, without requiring the users to verbally compare the SAS.

Q: Exactly how does Zfone and ZRTP protect against a man-in-the-middle (MiTM) attack?

Two human beings verbally compare the
Short Authentication String, drawing the
human brain directly into the protocol.
And this is a Good Thing.

A: The Diffie-Hellman key exchange by itself does not provide protection against man-in-the-middle (MiTM) attacks. To authenticate the key exchange, ZRTP uses a Short Authentication String (SAS), which is essentially a cryptographic hash of the two Diffie-Hellman values. The SAS value is rendered to both ZRTP endpoints. To carry out authentication, this SAS value is read aloud to the communication partner over the voice connection. If the values on both ends do not match, it indicates the presence of a man-in-the-middle attack. If they do match, there is a high probability that no man-in-the-middle is present. The use of hash commitment in the DH exchange constrains the attacker to only one guess to generate the correct SAS in his attack, which means the SAS can be quite short. A 16-bit SAS, for example, provides the attacker only one chance out of 65536 of not being detected.
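
To make the commit-then-reveal idea concrete, here is a minimal Python sketch of a hash commitment followed by a 16-bit SAS. It is a toy under stated assumptions, not the RFC 6189 message flow: the 127-bit prime `P`, the generator `G`, and the helper names (`dh_keypair`, `commitment`, `commitment_ok`, `short_auth_string`) are invented for illustration, and the SAS here is a plain SHA-256 hash of the two public values rather than the output of ZRTP's real key derivation.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group, for illustration only (assumption).
# A 127-bit modulus is far too small for real use.
P = 2**127 - 1
G = 3

def dh_keypair():
    """Return (secret exponent, public value) for one call."""
    secret = secrets.randbelow(P - 2) + 1  # 1 .. P-2
    return secret, pow(G, secret, P)

def encode(value):
    """Fixed-width encoding so hashes are unambiguous."""
    return value.to_bytes(16, "big")

def commitment(public_value):
    """Hash that binds a party to a public value before revealing it."""
    return hashlib.sha256(encode(public_value)).digest()

def commitment_ok(commit, revealed_public_value):
    """Check that the revealed value matches the earlier commitment."""
    return hmac.compare_digest(commit, commitment(revealed_public_value))

def short_auth_string(initiator_public, responder_public):
    """16-bit SAS: the first two bytes of a hash over both public values."""
    digest = hashlib.sha256(encode(initiator_public) + encode(responder_public)).digest()
    return int.from_bytes(digest[:2], "big")

# One honest exchange.
alice_secret, alice_public = dh_keypair()
alice_commit = commitment(alice_public)   # step 1: Alice sends only the commitment
bob_secret, bob_public = dh_keypair()     # step 2: Bob replies with his public value
# Step 3: Alice reveals alice_public, and Bob checks it against the commitment.
assert commitment_ok(alice_commit, alice_public)

alice_sas = short_auth_string(alice_public, bob_public)
bob_sas = short_auth_string(alice_public, bob_public)
print(f"Alice reads {alice_sas:04x}, Bob reads {bob_sas:04x}")
```

An attacker in the middle has to run a separate exchange with each party, so the two ends hash different public values. Because he must commit before he sees the other side's value, he cannot search for a pair that collides on the 16-bit result, which is where the single 1-in-65536 guess comes from.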

ZRTP provides a second layer of authentication against a MiTM attack, based on a form of key continuity. It does this by caching some hashed shared key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. If the MiTM is not present in the first call, he is locked out of subsequent calls. Thus, even if the SAS is never used, most MiTM attacks are stopped, because they weren't present in the first call. ZRTP's key continuity features have some self-healing properties that are better than the SSH approach.
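
The lock-out effect of key continuity can be illustrated with a short Python sketch. It is a simplified model, not the retained-secret handling specified in RFC 6189: the helpers `session_key` and `secret_for_next_call`, the label string, and the use of random bytes in place of a real Diffie-Hellman result are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

def session_key(dh_shared: bytes, cached_secret: bytes) -> bytes:
    """Mix this call's DH result with the secret cached from the previous call."""
    return hashlib.sha256(dh_shared + cached_secret).digest()

def secret_for_next_call(key: bytes) -> bytes:
    """Derive the value that both parties cache for their next call."""
    return hmac.new(key, b"cached secret for next call", hashlib.sha256).digest()

# Call 1: no attacker is present. Random bytes stand in for the DH shared secret.
dh_call1 = secrets.token_bytes(32)
key_call1 = session_key(dh_call1, b"")  # nothing cached yet on the first call
alice_cache = bob_cache = secret_for_next_call(key_call1)

# Call 2: Mallory completes a DH exchange with Alice, so he knows dh_call2,
# but he never saw the secret cached during call 1.
dh_call2 = secrets.token_bytes(32)
alice_key = session_key(dh_call2, alice_cache)
mallory_key = session_key(dh_call2, b"")

# For comparison: what Bob would compute in an honest second call.
honest_bob_key = session_key(dh_call2, bob_cache)

print("attacker locked out:", alice_key != mallory_key)
```

In this model the attacker's key disagrees with Alice's even though he controls the Diffie-Hellman exchange, because the cached secret from the earlier call never crossed the wire during the call he attacks.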

ZRTP provides yet a third layer of protection against a MiTM attack. The IETF plans to add integrity protection to the delivery of SIP information, and that integrity protection will rely on a PKI. When this eventually deploys, ZRTP can take advantage of this. See RFC 6189 on how ZRTP can leverage an integrity-protected SIP layer to provide integrity protection for ZRTP's Diffie-Hellman exchange in the media layer. This protects against a MiTM attack, without requiring the users to verbally compare the SAS. However, no VoIP clients yet offer a fully implemented SIP stack that provides end-to-end integrity protection for the delivery of SIP information. Thus, real-world implementations of ZRTP endpoints will continue to depend on SAS authentication for quite some time. Even after there is widespread availability of SIP products that offer integrity protection, many users will still be faced with the fact that the signaling path may be controlled by institutions that do not have the best interests of the end user in mind. In those cases, ZRTP's built-in SAS authentication will remain the gold standard for the prudent user. That, plus the key continuity features.

Q: What about DTLS-SRTP? Why not use that?

A: DTLS-SRTP is a derivative of TLS, a protocol designed for web browsers to talk to web servers, using a centrally managed PKI. A client-server protocol like TLS can work well in a client-server environment, but a phone call between two human beings is an ad-hoc peer-to-peer relationship, and the cryptographic key negotiations should reflect that. Instead of recycling a client-server protocol, ZRTP is purpose-built for VoIP. All these cryptographic protocols have a goal of negotiating keys in a way that stops man-in-the-middle (MiTM) attacks. To accomplish this, ZRTP doesn't need a PKI, and we don't need help from servers controlled by the phone company. Instead, ZRTP has two humans verbally compare a short authentication string to detect if there is a MiTM. Human beings can readily see if there is a MiTM by direct evidence and common sense. ZRTP harnesses the immense resources at both endpoints, which each have a brain with a hundred trillion synapses and the unique power of human intuition.

DTLS-SRTP tries to repurpose itself to VoIP's peer-to-peer environment, but it cannot escape its client-server roots, and that's why it depends so completely on the SIP servers to secure the connection. DTLS-SRTP's MiTM protection collapses in the absence of end-to-end integrity protection in the SIP layer. The only mechanism for this in SIP (besides S/MIME which has been around for 6 years without any implementation) is Enhanced SIP Identity (RFC 4474). However, it turns out that if you are using your SIP phone to call a regular phone number, then RFC 4474 doesn't provide integrity protection, and MiTM protection for DTLS-SRTP collapses. Why? Because for a regular phone number, the SIP identity is of the form sip:+13145551212@example.com asserted by example.com. A MiTM can just remove the RFC 4474 signature, change the a=fingerprint, then re-sign the identity as sip:+13145551212@example2.com asserted by example2.com. How does the callee know that this phone number is actually originating from example.com and not example2.com? There is no way to tell, hence, DTLS-SRTP has no protection from MiTM attacks. Regular phone numbers will be commonly used as identifiers for SIP phone customers for a long time, so this will continue to be a major security weakness for DTLS-SRTP.

Even if this problem with regular phone numbers is somehow solved, we are still left with the Elephant in the Room that in the final analysis, the security of DTLS-SRTP requires a PKI. The PKI dependency will either be contained within DTLS-SRTP itself or within SIP, because of the DTLS-SRTP dependency on SIP end-to-end integrity. All SIP end-to-end integrity mechanisms require a PKI, and all the complexity and bureaucracy that implies. Many years of experience in the crypto industry leads us to believe that PKI is an inappropriate approach to achieving media security in VoIP.

Some vendors who plan to implement DTLS-SRTP products say they will use self-signed public key certificates if no PKI is available. But a self-signed certificate offers no protection against a MiTM attack. If they don't use a PKI, and have no other MiTM attack countermeasure, such as key continuity or a short authentication string, it just won't be a secure phone.

Although far less important than the aforementioned pachyderm, here's another strike against this protocol: DTLS-SRTP must bear the additional cost of a signature calculation of its own, in addition to the signature calculation the SIP layer uses to achieve its integrity protection. ZRTP needs no signature calculation of its own to leverage the signature calculation carried out in the SIP layer. This may be relevant in low-power mobile platforms, or in highly loaded servers.

Q: Why do we need ZRTP if we already have SRTP? Isn't SRTP good enough?

A: This is the wrong question to ask. Despite the similarity in the two names, it is not a choice between SRTP and ZRTP. SRTP is the protocol we use to encrypt the low level voice packets. But you cannot use SRTP until both parties have agreed on what key to use for the SRTP encryption. The SRTP protocol (RFC 3711) says nothing about how session keys are negotiated. That's where ZRTP (RFC 6189) comes in. ZRTP is the protocol that the two parties use to negotiate the SRTP session key. Zfone uses SRTP, but it uses ZRTP first to negotiate the SRTP session key. There are several different protocols that may be used to negotiate SRTP session keys, including ZRTP, SDES, or DTLS. Of course, we think ZRTP is the best one.
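
The division of labor can be sketched in a few lines of Python: a key agreement protocol produces a shared secret, and only then can SRTP keys be derived from it. This is an illustrative sketch, not ZRTP's actual key derivation function. The `derive` helper and the label strings are assumptions, and random bytes stand in for the negotiated secret. The 16-byte master key and 14-byte master salt are the sizes SRTP uses with its default AES-128 cipher.

```python
import hashlib
import hmac
import secrets

SRTP_MASTER_KEY_LEN = 16   # 128-bit master key (SRTP default cipher)
SRTP_MASTER_SALT_LEN = 14  # 112-bit master salt

def derive(secret: bytes, label: bytes, length: int) -> bytes:
    """Hypothetical labeled derivation: HMAC-SHA256 truncated to `length` bytes."""
    return hmac.new(secret, label, hashlib.sha256).digest()[:length]

# Step 1: the key agreement protocol (ZRTP, in Zfone's case) yields a secret
# known only to the two endpoints. Random bytes stand in for it here.
negotiated_secret = secrets.token_bytes(32)

# Step 2: each direction of the call gets its own SRTP master key and salt.
srtp_keys = {}
for direction in ("initiator", "responder"):
    prefix = direction.encode("ascii")
    srtp_keys[direction] = {
        "master_key": derive(negotiated_secret, prefix + b" master key", SRTP_MASTER_KEY_LEN),
        "master_salt": derive(negotiated_secret, prefix + b" master salt", SRTP_MASTER_SALT_LEN),
    }

# Step 3 (not shown): SRTP encrypts the voice packets with these keys.
for direction, material in srtp_keys.items():
    print(direction, material["master_key"].hex(), material["master_salt"].hex())
```

Whatever protocol fills in step 1, SRTP itself only ever sees the output of step 2.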

But wait. When you say you are already using SRTP, what do you mean, exactly? Too many people in the VoIP industry have unfortunately started misusing the term SRTP as shorthand for "SRTP with keys negotiated via SDES". This wrongly presumes SDES is the only way to negotiate SRTP session keys. Which brings us to the next question.

Q: My VoIP phone already uses SDES to negotiate keys. Is that safe?

A: Good heavens, no! While most VoIP phones don't encrypt their calls at all, a few of them have implemented SDP Security Descriptions (SDES) (RFC 4568) to negotiate SRTP session keys to encrypt the call. Of all the methods the IETF has considered for this, SDES is arguably the least secure. Here's how it works. Suppose Alice wants to talk to Bob, who lives in China. Alice's phone generates a random session key to encrypt the conversation, but somehow Alice has to get this key into Bob's hands so they can both use it. Her phone transmits this key via SIP to her VoIP service provider, namely her local phone company. Her phone company, who now has full knowledge of this session key, transmits it to Bob's phone company in China. Bob's phone company, owned by the Chinese government who now has full knowledge of the session key, transmits it to Bob's phone. Now Alice and Bob's phones are ready to start an encrypted voice conversation. What's wrong with this picture?

If Alice wants to talk with Bob about human rights issues, or how they might overcome trade barriers, the Chinese government can easily monitor the call. To stay competitive in a global economy, it's important that a company use end-to-end encryption to protect its business communications from foreign governments. And some of us have doubts about whether our domestic phone company will always act with our best interests in mind.

If PGP Corp had implemented such an embarrassingly bad protocol, it would be met with shocked disbelief in the crypto community. But VoIP product vendors seem to get away with it, probably because crypto is not part of the VoIP industry's core competency. I've talked to VoIP vendors who just shrug and candidly admit they implemented SDES so they can simply check the "supports encryption" checkbox on their product feature checklist. Their excuse is that their customers have not demanded anything better.
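
The weakness is easy to see in code. The Python sketch below builds an SDES-style `a=crypto` attribute of the general form defined in RFC 4568, then reads the key back out the way any SIP server along the signaling path could. The function names are hypothetical, and the key is freshly generated random data, not taken from any real call.

```python
import base64
import secrets

def build_crypto_line(master_key: bytes, master_salt: bytes) -> str:
    """What Alice's phone puts in its SDP offer: the key and salt, base64-encoded."""
    blob = base64.b64encode(master_key + master_salt).decode("ascii")
    return f"a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:{blob}"

def read_key_like_a_proxy(sdp_line: str):
    """What any server that relays the SIP message can do with that line."""
    key_params = sdp_line.split(" ")[2]               # the "inline:..." field
    blob = key_params[len("inline:"):].split("|")[0]  # drop optional lifetime/MKI fields
    raw = base64.b64decode(blob)
    return raw[:16], raw[16:]                         # 16-byte key, 14-byte salt

# Alice's phone picks the SRTP key material for the call.
alice_key = secrets.token_bytes(16)
alice_salt = secrets.token_bytes(14)
offer_line = build_crypto_line(alice_key, alice_salt)
print(offer_line)

# Every phone company on the signaling path sees the same line.
seen_key, seen_salt = read_key_like_a_proxy(offer_line)
print("proxy recovered the session key:", seen_key == alice_key)
```

Encrypting the SIP signaling hop by hop does not change this picture, since each server on the path still handles the SDP body in the clear.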

Q: Why can't I just use IPsec to encrypt my VoIP calls?

A: Well, you can, but it would be a bad idea in most cases. IPsec encryption is done down in the IP layer of the Internet protocol suite's protocol stack, which is too low a layer to let the user know if it is running. Some routers support IPsec, and some don't. You don't know if the other party supports IPsec, so some connections will not be encrypted, and you would never know it. If you don't know for sure whether the call is encrypted, what good is it? It's better to do the encryption at the application layer, so that the user can be told if the call is encrypted.

Q: I'm a VoIP developer. How can I get the ZRTP protocol implemented in my VoIP application?

A: We have a nice SDK for VoIP developers to integrate the ZRTP protocol into their software or hardware VoIP clients or servers. The software is implemented in C. Visit the Zfone libZRTP SDK page to license it, and reduce your time-to-market quite a bit. We also have comprehensive API documentation for the SDK. And to help you debug your product by using your Wireshark packet sniffer to recognize and dissect ZRTP packets, check out our nifty ZRTP packet dissector for Wireshark.

Q: What's the difference between ZRTP, Zfone, and the libZRTP SDK?

A: ZRTP is the protocol specification we developed that describes how to negotiate session keys for a secure VoIP phone call. We have a software development kit, a subroutine library for developers that implements the ZRTP protocol, called libZRTP, which can be integrated into VoIP applications. And we have a complete software application called Zfone that showcases the libZRTP SDK. Zfone is ready to run for end users, not just VoIP developers. Zfone is not itself a VoIP client-- it is an "adapter" that lets you make secure phone calls using your favorite VoIP client. If you are developing your own VoIP application and you want to build in ZRTP capabilities, you should get our libZRTP SDK. If you just want to make secure VoIP phone calls as an end user, get Zfone.

Q: Is Zfone open source?

A: No. The Zfone application is not available under the GPL, the AGPL, or any other open source license. However, the Zfone libZRTP SDK library is published under the AGPL open source license. The SDK is also available under a commercial license, in a dual-licensing model. See our Licensing Policy page for details.

I'm a firm believer in publishing the source code for cryptographic software for peer review, to build public confidence that it contains no back doors, a tradition I started in 1991 with PGP. PGP is a proprietary product, even though the source code is available for peer review. Publishing the source code for peer review is not the same as making it available under an open source license.

Zfone has several major components, and not all of them are published under the same licensing terms. The entire body of source code for the complete Zfone client software is published for peer review purposes. In addition, the Zfone libZRTP SDK library is published under the AGPL version 3. However, the rest of the Zfone application remains proprietary. That's why we show you an End User License Agreement before you download Zfone.

Q: Is Zfone covered by any patents?

A: Yes, but don't worry-- we offer a royalty-free patent license to anyone who doesn't sue us. I think software patents stifle innovation and have done a great deal of harm to the software industry, especially in the crypto world. The RSA patent holders wielded their patent to do all they could to destroy PGP in the 1990s. I don't ever want to experience this problem again, so our patent is purely defensive. Having the patent also allows us to better serve the user community's interests by providing leverage to get other ZRTP implementors to abide by ZRTP's back-door-resistant features.

Q: Do you support Elliptic Curve Diffie-Hellman?

A: Yes, we do. Although ZRTP always supports classic Diffie-Hellman, if you license our libZRTP SDK, we offer optional support for the ECDH protocol as defined by NIST SP 800-56A and NSA Suite B, which uses the same elliptic curves as used by ECDSA in FIPS 186-3. Elliptic curve algorithms are the next generation of public key cryptography, offering a level of security that better matches the full key strength of AES-256. The free versions of Zfone do not support ECDH.

Q: Does Zfone work with Skype?

A: No. Skype uses a closed proprietary protocol, which they do not publish. That makes it difficult to design Zfone to work with it. Skype intentionally does not interoperate with the rest of the VoIP industry, which is built on open standards. I decided to follow the industry standards.

There is one other problem with Skype. It uses a Variable-Bit Rate codec, which introduces significant vulnerabilities, regardless of the quality of encryption.

Q: Does Zfone work with plain old telephone service (POTS) phones?

A: Nope. Sorry. It only works with VoIP protocols, not PSTN, or POTS, phones. VoIP is the wave of the future, so I'm not motivated to try to retrofit this to work with the old public switched telephone network. A famous hockey player said "I try to skate to where I think the puck will be."

Q: Can I use Zfone or ZRTP to call regular landline or mobile telephone numbers?

A: This question is not related to Zfone or ZRTP. It just depends on your VoIP service. What kind of VoIP service do you have? Does it allow only VoIP-to-VoIP calls, or does it have a PSTN gateway that also allows you to make calls to regular PSTN or mobile phone numbers? If the latter, then yes, you can make such a call. But the call will not be secure. A regular POTS phone cannot support ZRTP. ZRTP only works with VoIP-to-VoIP calls. The whole point of Zfone is to make secure calls. However, there are ZRTP-enabled VoIP clients that happen to run on smart mobile phones, so that would allow a secure VoIP call to a mobile phone.

Q: My VoIP service provider (such as Vonage or AT&T) gave me an ATA (Analog Telephony Adapter), or VoIP router, that allows me to connect my old-fashioned telephone to my broadband connection. Will Zfone work with that?

A: Well, not with that exact setup, no. Your ATA or VoIP router is a hardware device that lets you connect your old analog telephone to a VoIP network. To make a secure call with that kind of setup, you would have to have an ATA with the ZRTP protocol integrated inside, which will happen someday, we hope. In the meantime, if you really want to run Zfone now, you need to run a software VoIP client (such as X-Lite, Apple iChat, SJphone, or perhaps a software VoIP client supplied by your VoIP service provider) on your PC or Macintosh computer. You can use the software VoIP client to connect to your VoIP service provider from your computer, not from a normal telephone. Then you can install Zfone on the same computer, and have it convert your VoIP call to the ZRTP protocol. And, of course, the other party you are calling must also be running VoIP with the ZRTP protocol (such as Zfone) on the other end. This will become simpler when ATA or VoIP router vendors integrate the ZRTP protocol inside their hardware.

Q: Does ZRTP work with the Asterisk PBX? What about IAX?

A: ZRTP has been successfully integrated into Asterisk, an open source PBX server from Digium. This modified version of Asterisk supports ZRTP in SIP/RTP calls. ZRTP has protocol features specially designed to allow a PBX to act as a trusted man-in-the-middle. For more information on ZRTP support for Asterisk SIP/RTP calls, see our Asterisk support page. However, ZRTP and IAX (Inter-Asterisk eXchange protocol) are not well suited for each other in their present forms.

Q: Does ZRTP work with the FreeSWITCH PBX?

A: ZRTP has been integrated into FreeSWITCH, which is another open source PBX. FreeSWITCH is a more elegantly designed and better implemented PBX than Asterisk, despite being less deployed than Asterisk. ZRTP's tight integration into FreeSWITCH is far more extensive than the limited functionality patch that we did for Asterisk. FreeSWITCH supports ZRTP across the full range of FreeSWITCH's features and functionality. FreeSWITCH is the best PBX to showcase how ZRTP can work in a PBX.

Q: Can ZRTP be used in conference calls?

A: Yes. Most conference calls are connected via a conference call mixer or a PBX, connected in a star topology. Think of the conference call mixer as the hub of a spoked wheel, with each spoke connected to a party in the conference call. Each spoke is a separate ZRTP connection, with its own separate session key. The conference call mixer acts as a trusted man-in-the-middle (MiTM) between ZRTP-equipped phones, and performs the audio mixing for the conference call.

Q: Can ZRTP be used with H.323 or other signaling protocols?

A: Yes, ZRTP can be used with any signaling protocol, including SIP, H.323, MGCP, Jingle, and Peer-to-peer SIP. The XMPP Standards Foundation has a protocol extension for "Use of ZRTP in Jingle RTP Sessions", XEP-0262. ZRTP is independent of the signaling layer, because it does all its key negotiations in the media stream. In fact, Zfone now encrypts calls made with the Google Talk VoIP client, which uses Jingle for signaling instead of SIP. ZRTP is the only VoIP encryption protocol with this much flexibility.

Q: Which platforms do you support?

A: Zfone runs on Windows XP and Vista, both 32-bit and 64-bit versions. Zfone also runs on Linux and Mac OS X. Zfone will encrypt audio and video for Apple iChat calls on Mac OS X (Leopard), but not file transfers, text chat, or remote desktop control. Zfone has been tested with these VoIP clients: X-Lite, Gizmo (audio, no video yet), XMeeting, Yahoo Messenger's VoIP client (for audio), Magic Jack, and SJphone. It does not work with Skype. Note that Gizmo's VBR codec creates some security vulnerabilities that Zfone cannot fix.

The Zfone libZRTP SDK runs on more platforms than the Zfone application. It has been used on Windows XP, Vista, Mac OS X, Linux, Windows Mobile, and Symbian. Our SDK is written in C. We don't have a Java version at this time.

Q: Do both parties need Zfone (or products that use the Zfone protocol) to make a secure call?

A: I actually get asked this question more often than one might imagine, which is why I'm finally adding this question to this FAQ. OK, try to think this one through. Do both parties need a fax machine to send a fax? Do both parties need a phone to make a phone call? Do both parties need to speak Navajo to have a conversation in Navajo? I'm confident you'll suss out the answer.

Q: How do ZRTP's key continuity features compare with SSH?

A: The key continuity features of ZRTP are analogous to those provided by SSH, but they differ in one respect. SSH caches public signature keys that never change, and uses a permanent private signature key that must be guarded from disclosure. If someone steals your SSH private signature key, they can impersonate you in all future sessions and mount a successful man-in-the-middle (MiTM) attack any time they want.

ZRTP caches symmetric key material that is mixed into the next session's secret session key, which changes with each session. If someone steals your ZRTP shared secret cache, they only get one chance to mount a MiTM attack, in the very next session. If they miss that chance, the retained shared secret is refreshed with a new value, and the window of vulnerability heals itself, which means they are locked out of any future opportunities to mount a MiTM attack. This gives ZRTP a "self-healing" feature if any cached key material is compromised.

A MiTM attacker must always be in the media path. This presents operational difficulties for the attacker in many VoIP usage scenarios, because being in the media path for every call is often harder than being in the signaling path. This creates coverage gaps in the attacker's opportunities to mount a MiTM attack. ZRTP's self-healing key continuity features are better than SSH at exploiting any temporary gaps in MiTM attack coverage. Thus, ZRTP quickly recovers from any disclosure of cached key material.
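
To make the self-healing idea concrete, here is a minimal Python sketch. It is not the real ZRTP key derivation: the function name `next_session`, the label strings, and the use of HMAC-SHA256 as the mixing function are all assumptions chosen for illustration. It only shows the general shape of the argument, namely that a stolen cache goes stale as soon as one call completes without the attacker in the media path.

```python
import hashlib
import hmac
import os


def next_session(cached_secret: bytes, dh_result: bytes) -> tuple[bytes, bytes]:
    """Mix the cached secret with this call's fresh DH result (illustrative only)."""
    # Both inputs are needed to compute the per-session master secret.
    s0 = hmac.new(cached_secret, dh_result, hashlib.sha256).digest()
    session_key = hmac.new(s0, b"session key", hashlib.sha256).digest()
    # The cache is replaced on every call, so an old stolen copy goes stale.
    new_cached_secret = hmac.new(s0, b"retained secret", hashlib.sha256).digest()
    return session_key, new_cached_secret


# Alice and Bob share a cached secret; Eve steals a copy of it.
cache = os.urandom(32)
stolen = cache

# Call 1: Eve is not in the media path, so she never learns this DH result.
_, cache = next_session(cache, os.urandom(32))

# Call 2: Eve tries again with the stale secret and a DH result she does know.
dh2 = os.urandom(32)
honest_key, _ = next_session(cache, dh2)
eve_key, _ = next_session(stolen, dh2)
print(honest_key != eve_key)
```

In this toy model Eve's only useful moment was call 1. Once that call completed without her, the refreshed cache no longer matches her stolen copy.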

In systems that use a persistent private signature key, such as SSH, the stored signature key is usually protected from disclosure by encryption that requires a user-supplied high-entropy passphrase. This arrangement may be acceptable for a diligent user with a desktop computer sitting in an office with a full ASCII keyboard. But it would be prohibitively inconvenient and unsafe to type a high-entropy passphrase on a mobile phone's numeric keypad while driving a car. Users will reject any scheme that requires the use of a passphrase on such a platform. This means mobile phones carry an elevated risk of compromise of stored key material, and thus would especially benefit from the self-healing aspects of ZRTP's key continuity features.

The infamous Debian OpenSSL weak key vulnerability (discovered and patched in May 2008) offers a real-world example of why ZRTP's self-healing scheme is a good way to do key continuity. The Debian bug resulted in the production of a lot of weak SSH keys, which continued to compromise security even after the bug had been patched. In contrast, ZRTP's key continuity scheme adds new entropy to the cached key material with every call, so old deficiencies in entropy are washed away with each new session.

There is one more benefit to ZRTP's form of key continuity. It confers a measure of immunity from any future breakthroughs in quantum computing that may undermine the strength of public key algorithms such as Diffie-Hellman. I think the fears about quantum computers are overblown, because of severe practical difficulties in building effective quantum computers. Nonetheless, let's assume that someday quantum computers of the required complexity can somehow be built. Despite this, symmetric key algorithms with key lengths of 256 bits remain secure against quantum computers. This means ZRTP's symmetric key continuity can save the day, provided the wiretapper was not present in the first call. That's because the wiretapper does not know the shared secret inherited from the first call that will be mixed into the session keys of subsequent calls. In which case he cannot compute the new session key, even if he can break Diffie-Hellman.

This also implies that you can use AES-256 with 3072-bit Diffie-Hellman keys, and still get the full benefit of the strength of the 256-bit AES key. This would seem to be at variance with NIST guidelines which recommend using AES-128 with DH-3072, because the work factor for breaking both of them is about equal. But if ZRTP can mix in an additional 256 bits of shared secret entropy from earlier sessions when computing the new session key, the strength of the result can exceed the strength of DH-3072. This assumes the wiretapper was not able to observe the first session.

Q: Isn't it a protocol layer violation to do the key management in the media instead of in the signaling?

A: Some proponents of other VoIP encryption schemes say that it offends their sensibilities to see ZRTP negotiate the cryptographic keys in the media stream, instead of in the signaling layer, as other VoIP encryption schemes do. They call it a "layer violation". But to me (and to a number of other protocol designers I've talked to), it seems clear that the signaling should take care of its own key negotiation for signaling authentication, and the media layer should negotiate its own keys for media encryption. The two layers should each take care of their own cryptographic needs. If anything, doing the media encryption key negotiation in the signaling layer is the real layer violation.

In the same vein, I don't feel that the VoIP service providers can always be trusted to act with my interests in mind, so I don't want to involve their SIP servers in my encryption key negotiations. If I want to speak Navajo with my friend on the phone, I shouldn't have to clear it first with the phone company. It's just none of their business. And that's part of what makes ZRTP so broadly appealing.

It's also worth noting that traditional secure telephones in the PSTN world, such as the AT&T TSD-3600 or the STU-III, did all their key negotiations in the media stream. They used a modem to establish a digital channel on a normal voice grade phone line, negotiated their keys, and sent an encrypted voice stream all on the same channel. No one called it a layer violation. This is the way secure phones always worked before VoIP came along.

Q: Many VoIP clients include some form of built-in text chat or instant messaging. Does Zfone or ZRTP encrypt those text messages?

A: No. Not yet, anyway. ZRTP does very well by limiting its mission to just managing the keys and encrypting RTP media streams for VoIP. Also, these instant text messaging protocols come in a number of different variants, such as AIM or Jabber, with different VoIP clients supporting different instant messaging protocols. Each of them will require a different method of encryption, and that remains to be worked out. Some methods already exist for encrypting some forms of text chatting, such as the one offered by PGP Corp. We're looking at the most appropriate methods to add this capability to Zfone.

Q: Does Zfone have any "back doors"?

A: Anyone who knows anything about me knows the answer is No. In fact, I have a whole page on that subject regarding PGP software, and it applies equally to Zfone. Now, having said that, I remind you that Zfone is still beta software, and has a few bugs. Until I do a real release, I make no claims about it being secure. We haven't finished our internal code reviews, and we may yet discover bugs that affect security. That's also why we have a public beta. We need you to help us test the code.

A good rule of thumb about how you can tell if there is no back door in a product: Do they publish their source code? I do. I publish the source code for Zfone, just as I did with PGP. You can inspect it yourself, and build the executable from the source code yourself. I don't feel comfortable trusting a crypto product if they don't publish their source code.

It's easy to keep back doors out of your own product, as I do with Zfone. It's much harder to keep back doors out of other vendors' implementations of the ZRTP protocol. Nonetheless, I decided to at least make an attempt. Take a look at these ideas for ZRTP's back-door-resistant features.

Q: Has anyone done any real security analysis on Zfone or ZRTP?

A: Yes. Andy Clark's security analysis company, Detica Forensics, did a report in January 2008, available here as a PDF file: Forensic Analysis of Zfone. We stood up pretty well in this report.

Riccardo Bresciani at Trinity College in Dublin has also done a formal security analysis of ZRTP, using some special purpose security protocol analysis tools. His report The ZRTP Protocol - Analysis on the Diffie-Hellman Mode (PDF) concludes "The analysis performed on the protocol has formally proven that ZRTP is a safe key agreement protocol".

Q: Will Zfone's protocol pass FIPS-140 validation?

A: ZRTP-based products must receive FIPS-140 validation in order to be sold to US government customers, or to the governments of some other countries. To facilitate this, ZRTP uses only FIPS-approved algorithms in all relevant categories. To meet the FIPS-140 validation requirements set by NIST FIPS PUB 140-2 Annex A and NIST FIPS PUB 140-2 Annex D, ZRTP is compliant with:

NIST SP 800-56A - Recommendation for Pair-Wise Key Establishment Schemes Using Discrete Logarithm Cryptography
NIST SP 800-108 - Recommendation for Key Derivation Using Pseudorandom Functions
NIST FIPS PUB 198-1 - The Keyed-Hash Message Authentication Code (HMAC)
NIST FIPS PUB 180-3 - Secure Hash Standard (SHS)
NIST SP 800-38A - Recommendation for Block Cipher Modes of Operation
NIST FIPS PUB 197 - Advanced Encryption Standard (AES)
NSA Suite B Cryptography

Q: Is the Short Authentication String (SAS) vulnerable to an attacker with voice impersonation capabilities?

A: In practical terms, no. It is a mistake to think this is simply an exercise in voice impersonation (perhaps this could be called the "Rich Little" attack). Although there are digital signal processing techniques for changing a person's voice, that does not mean a man-in-the-middle attacker can safely break into a phone conversation and inject his own short authentication string (SAS) at just the right moment. He doesn't know exactly when or in what manner the users will choose to read aloud the SAS, or in what context they will bring it up or say it, or even which of the two speakers will say it, or if indeed they both will say it. In addition, some methods of rendering the SAS involve using a list of words, notably the PGP word list, in a manner analogous to how pilots use the NATO phonetic alphabet to convey information. This can make it even more complicated for the attacker, because these words can be worked into the conversation in unpredictable ways. If the session also includes video, the MiTM may be further deterred by the difficulty of making the lips sync with the voice-spoofed SAS. Remember that the attacker places a very high value on not being detected, and if he makes a mistake, he doesn't get to do it over.

To further reduce the likelihood of a voice impersonation attack, we recommend that both parties verbally repeat the SAS, if they feel that the call is likely to invite the attention of an especially resourceful opponent who is willing to take risks. We also recommend that if the user interface permits, the SAS should be rendered via the PGP word list, instead of using base-32 digits.

Some people have raised the question that even if the attacker lacks voice impersonation capabilities, it may be unsafe for people who don't know each other's voices to depend on the SAS procedure. This is not as much of a problem as it seems, because it isn't necessary that they recognize each other by their voice, it's only necessary that they detect that the voice used for the SAS procedure matches the voice in the rest of the phone conversation.
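
The base-32 rendering of the SAS mentioned above can be sketched in a few lines of Python. This is an illustration only, not code from Zfone: the function name `render_sas`, the choice of four characters (20 bits), and the z-base-32 alphabet are assumptions of this sketch, and the input strings are made up. The point is that two endpoints holding the same key material display the same short string, while a man-in-the-middle who negotiated different keys with each side will almost always cause a mismatch.

```python
import hashlib

# z-base-32 alphabet (an assumption here: the text only says "base-32 digits").
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"


def render_sas(sas_hash: bytes, chars: int = 4) -> str:
    """Render the leftmost 5*chars bits of a hash as base-32 characters."""
    bits = 5 * chars
    value = int.from_bytes(sas_hash, "big") >> (len(sas_hash) * 8 - bits)
    out = []
    for shift in range(bits - 5, -1, -5):
        out.append(ZBASE32[(value >> shift) & 0x1F])
    return "".join(out)


# Hypothetical key material: what the two honest parties share, and what a
# MiTM would have negotiated on one leg of an intercepted call.
shared_view = hashlib.sha256(b"key material both sides agree on").digest()
mitm_view = hashlib.sha256(b"key material the attacker negotiated").digest()
print(render_sas(shared_view), render_sas(mitm_view))
```

With 20 bits, a blind guess by the attacker would match roughly one time in a million (2^20 = 1,048,576), and he gets only one attempt per call.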

Q: Does Zfone and ZRTP encrypt Touch-Tone keypad DTMF tones?

A: Yes. ZRTP encrypts all RTP traffic, including Touch-Tone keypad DTMF tones. DTMF tones are carried in the RTP media stream using methods defined by RFC 2833, embedded as special RTP payload types. We encrypt these along with the rest of the RTP media stream, which is important because people use DTMF tones to enter their credit card numbers when they call their bank, for example.
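
For readers unfamiliar with how a keypress travels in the media stream, the following Python sketch packs and unpacks the four-byte telephone-event payload described by RFC 2833. The function names are hypothetical and this is not Zfone code; it only illustrates the kind of data that sits inside the RTP payload and is therefore covered by the media encryption.

```python
import struct


def pack_telephone_event(event: int, end: bool, volume: int, duration: int) -> bytes:
    """Build the 4-byte telephone-event payload carried inside an RTP packet."""
    if not (0 <= event <= 255 and 0 <= volume <= 63 and 0 <= duration <= 0xFFFF):
        raise ValueError("field out of range")
    # Second byte: E (end) bit, R (reserved) bit left as 0, then 6 bits of volume.
    flags = (0x80 if end else 0) | volume
    return struct.pack("!BBH", event, flags, duration)


def unpack_telephone_event(payload: bytes):
    """Return (event, end, volume, duration) from a telephone-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return event, bool(flags & 0x80), flags & 0x3F, duration


# Digit "5" (event code 5), final packet, 100 ms at an 8000 Hz timestamp clock.
payload = pack_telephone_event(event=5, end=True, volume=10, duration=800)
print(payload.hex())
```

Anyone who could read these bytes in the clear would learn the digit directly from the first byte, which is why they need to be encrypted along with the voice packets.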

There is a potential but unlikely problem with DTMF handling that has never been reported in Zfone. In unusual cases it is possible to send DTMF over the SIP channel. Some very old, non-standard SIP clients send it using a SIP INFO method - there is no RFC for this and it is discouraged strongly by the SIP standards community. There is also a new SIP extension (RFC 4730) known as KPML (Keypress Markup Language) which can be used to send DTMF over a SIP NOTIFY - but very few VoIP clients implement this yet, and the ones that do don't seem to be using it, and can be easily configured to not use it. If you ever find a VoIP client that uses KPML, we recommend that you simply disable this feature and allow DTMF to be carried the traditional way in the RTP media stream. Note that all RTP media encryption protocols, not just ZRTP, would be equally affected by this problem if SIP is used to carry DTMF.

There was a problem with an old version of the Zfone beta software regarding the encryption of DTMF tones. Forbes magazine had an article on 2 August 2007 that reported a problem which they attributed to ZRTP's handling of DTMF tone encryption. In fact this was not due to any deficiencies in the ZRTP protocol, but was due to a software bug in the Zfone beta software that existed in April 2007. That bug was fixed before the article appeared. The bug only happened when Zfone was used in conjunction with SJLabs' SJphone, triggered by a subtle interaction with a bug in SJphone that was improperly generating DTMF packet sequences. Current versions of Zfone always encrypt DTMF tones from all VoIP clients, including SJphone.

Q: What if I use a Variable Bit Rate (VBR) codec? Won't that leak information?

A: Johns Hopkins University researchers have observed that when voice is compressed with a variable bit-rate (VBR) codec, the packet lengths vary depending on the types of sounds being compressed. This leaks a lot of information about the content even if the packets are encrypted, regardless of what encryption protocol is used. We strongly recommend that you avoid using VBR codecs if you want to make a secure phone call. Most codecs are not VBR, so it's not hard to avoid using VBR. If you plan to use Zfone with a VoIP client, open the preference panel in the VoIP client and disable the VBR codecs from the menu.

Some codecs have a VBR mode and a non-VBR mode, so you should disable the VBR mode. If you are the implementor of a VoIP client, you can disable the VBR feature while still allowing the codec to be used. But if you are the end user, most VoIP clients do not have preference panels that allow such fine granularity of control, so you will be lucky to be able to just disable the whole codec. Some VoIP clients don't allow the user any choice at all about what codecs are used.

Some safe non-VBR codecs include GSM 6.10, iLBC, G.711 (A-law and u-law, also known as PCMA and PCMU), G.722, and G.726. It's not a problem if the codec adapts the bit rate to the available channel bandwidth. The dangerous codecs are the ones that change their bit rate depending on the type of sound being compressed.

Let me be clear about this leakage of information-- it doesn't leak any cryptographic key material, and it doesn't help the attacker actually break the crypto. The VBR codec is leaking information about the content of the voice packets, because some sounds compress more than other sounds. By looking at how much each packet of sound was compressed, which can be inferred by the packet size, it is possible to infer something about what kind of sound it is, like a vowel, or a sharp consonant. This undermines the usefulness of the encryption. Some phrases can be identified with an accuracy of 50% to 90%. This is a serious vulnerability.
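
A small Python sketch can show why encryption does not hide this. Everything in it is made up for illustration: the frame sizes in `VBR_FRAME_SIZE`, the three sound classes, and the `toy_encrypt` function, which is a toy length-preserving stream cipher standing in for a real cipher and is not suitable for actual use. The only property that matters here is that ciphertext length equals plaintext length.

```python
import hashlib

# Hypothetical frame sizes in bytes for a made-up VBR codec.
VBR_FRAME_SIZE = {"silence": 6, "vowel": 38, "consonant": 22}
CBR_FRAME_SIZE = 38


def toy_encrypt(key: bytes, seq: int, frame: bytes) -> bytes:
    """Length-preserving toy stream cipher (stand-in for a real cipher)."""
    stream = b""
    block = 0
    while len(stream) < len(frame):
        counter = seq.to_bytes(4, "big") + block.to_bytes(4, "big")
        stream += hashlib.sha256(key + counter).digest()
        block += 1
    return bytes(a ^ b for a, b in zip(frame, stream))


def observed_lengths(sounds, frame_size_for):
    """Packet lengths an eavesdropper sees for a sequence of sound classes."""
    key = b"k" * 32
    return [len(toy_encrypt(key, i, bytes(frame_size_for(s))))
            for i, s in enumerate(sounds)]


sounds = ["silence", "vowel", "consonant", "vowel", "silence"]
vbr_lengths = observed_lengths(sounds, VBR_FRAME_SIZE.get)
cbr_lengths = observed_lengths(sounds, lambda s: CBR_FRAME_SIZE)

# An eavesdropper who knows the codec maps lengths back to sound classes.
size_to_sound = {size: sound for sound, size in VBR_FRAME_SIZE.items()}
guessed = [size_to_sound[n] for n in vbr_lengths]
print(guessed == sounds)       # the VBR length pattern gives the sounds away
print(len(set(cbr_lengths)))   # number of distinct lengths in the CBR stream
```

The eavesdropper never touches the key. With the constant-rate frames every packet has the same length, so the length pattern carries no information about the sounds.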

Fortunately, not too many codecs use VBR. Speex has a VBR-capable codec, and some VoIP applications that use Speex allow the user to choose which codecs to enable. iSAC is a commercially licensed VBR codec, used by Skype, Google Talk, and Gizmo. This means that Skype is vulnerable to VBR leakage regardless of the quality of Skype's built-in crypto. Sadly, this also means Gizmo and Google Talk are not the safest VoIP clients to use with Zfone. And it appears the user cannot disable the use of this codec in those products. Microsoft's RT Audio also appears to be a VBR codec, and is used in Microsoft Office Communicator.

It also appears that voice activity detection (VAD) leaks information about the content of the conversation, but to a far lesser extent than VBR. This effect can be mitigated by lengthening the VAD "hangover time" by about 1 to 2 seconds. That would sharply reduce the information leakage, but it may be something that only the VoIP application developer can do, if the VAD parameters are tunable. For an end user, a simpler solution would be to avoid the use of VAD, if this is feasible in your situation. Examples of codecs that use VAD include AMR and G.722.2. If it's not convenient to avoid all VAD codecs, keep in mind that the leakage from VAD is much less than the leakage from VBR.

Some researchers have suggested that the VAD hangover time should be lengthened by a random amount, for example one drawn from a normal distribution over the range of 1 to 2 seconds. Most codecs that use VAD only allow a fixed amount of VAD hangover to be easily configured. It remains unclear whether a random hangover time is worth the extra effort. This requires further research.
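
For a VoIP developer wondering what a randomized hangover might look like, here is a rough Python sketch. It is not taken from any codec: the function name `apply_hangover`, the 20 ms frame length, and the per-frame speech flags are assumptions, and it uses a simple uniform draw between 1 and 2 seconds in place of the distribution the researchers suggest.

```python
import random


def apply_hangover(speech_flags, frame_ms=20, min_hang_ms=1000,
                   max_hang_ms=2000, rng=None):
    """Return per-frame transmit decisions with a randomized hangover."""
    rng = rng or random.Random()
    transmit = []
    frames_left = 0
    for is_speech in speech_flags:
        if is_speech:
            # Redraw the hangover so it is counted from the last speech frame.
            frames_left = rng.randint(min_hang_ms, max_hang_ms) // frame_ms
            transmit.append(True)
        elif frames_left > 0:
            # Keep sending packets during the hangover, as if still speaking.
            frames_left -= 1
            transmit.append(True)
        else:
            transmit.append(False)
    return transmit


# One frame of speech followed by three seconds of silence.
flags = [True] + [False] * 150
decisions = apply_hangover(flags, rng=random.Random(1))
print(sum(decisions) - 1, "hangover frames sent after speech ended")
```

Because the transmitter keeps sending for an unpredictable one to two seconds after speech stops, the on/off pattern visible to an eavesdropper no longer follows the speaker's pauses closely.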

Q: Why do I have to register in order to download Zfone?

A: Although the US has ended most of its export controls for crypto software, there are still some reasonable residual export controls in place, namely, to prevent the software from being exported to a few embargoed nations, such as North Korea, Iran, Libya, Syria, and Sudan. And for commercial encryption software that you actually pay for (which does not include this free public beta), there are now requirements to check customers against government watch lists as well, which is something that companies such as PGP comply with these days. PGP Corp graciously volunteered to host the public beta software on their server, with all the appropriate checks in place, which means Zfone users are screened exactly the same way as PGP customers.

The Zfone registration page checks your IP address against the list of embargoed countries, then emails you a link that you must click on to start your download, and checks your IP address again when you follow that link, which presumably means you did not receive your email in an embargoed country, and that the download itself did not go to an embargoed country. The U.S. Government deems this adequate evidence that we made our best efforts to comply with U.S. export laws. Staying out of that kind of trouble is important to me. Been there. Done that.

Q: If I integrate your SDK into my VoIP application, will I have to worry about U.S. export controls?

A: Well, yes, but it's pretty easy to deal with these days. If you plan to export your product from the U.S., you will have to file some papers with the U.S. Commerce Department. I did that for Zfone, and you can see the results of that here. I recommend you use Roszel Thomsen, at the law firm of Thomsen and Burke LLP.

Q: Does Zfone protect against "social network analysis" and other forms of analysis based on traffic patterns?

A: No, not at all. Zfone just encrypts the contents of the call. The only way to protect against traffic analysis is to go through multiple intermediaries, which is a technique that has been used to protect email and web browsing (see the Tor project for an example of this). But this adds latency to communications, which may be unnoticeable for email, and at least tolerable for web browsing, but would be unacceptable for phone calls. Further, these countermeasures may be ineffective against a clever and resourceful opponent, because it's hard to hide the timing and length of the messages, especially if there are real-time communication requirements.

Q: Why is it called ZRTP?

A: Well, I'd rather not have to use the unwieldy acronym for Media Path Key Agreement for Secure RTP. It was Alan Johnston who suggested ZRTP, because it negotiates the session keys for SRTP, Secure Real-time Transport Protocol (RFC 3711). We just used a regrettably less descriptive mutation of that name, incorporating my last initial. In defense of my immodesty, it's worth noting that in the crypto community, eponymous crypto protocols are very much the norm. Examples include RSA, Diffie-Hellman, ElGamal, CAST, RC2 and RC4 (Ron's Code), Blakley-Shamir, and many others.

Alan's original rationale for the name ZRTP also was based on the fact that ZRTP was originally (in its first couple of Internet drafts) based on adding header extensions to RTP packets, so ZRTP was really a variant of RTP. We have since changed the packet format to no longer encapsulate the ZRTP packets inside RTP header extensions, but to make them into a separate packet format that is distinguishable from RTP. In view of that change, ZRTP is now a pseudo-acronym.

Q: Does Zfone or ZRTP slow down the VoIP call?

A: No, not during the conversation. At the start of the call it takes a second or two to negotiate the cryptographic keys, but after that, it's smooth sailing. The AES encryption is much faster than the speech compression that your VoIP client already performs. You won't notice any additional delay.

Q: Why does Zfone show the IDLE status during some VoIP calls?

A: Some VoIP clients attempt to traverse NAT routers by sending RTP voice and video packets through TCP instead of UDP. This protocol tunneling violates the IETF standards for VoIP, which require that RTP media packets be sent over UDP. Zfone assumes that RTP will be found only in UDP packets, and thus will not detect RTP sent through TCP. In that case, Zfone's GUI displays the "Idle" status during a call, and does not engage the ZRTP protocol. Sometimes the packets are going through a media relay which converts them to UDP for the other party, whose Zfone client can therefore see the media stream, but searches in vain for the idled ZRTP peer and displays the "NOT Secure / No ZRTP Peer" status.

If this happens, here are a couple of workarounds: 1) The best solution is to move one of the parties' computers (in particular, the one that displays IDLE) off their local network to an external IP address, thereby simplifying the NAT traversal problem. Even better, move both computers to external IP addresses. 2) Or it might help to switch one of the parties (especially the IDLE one) to a different VoIP client. Often the VoIP client software decides to straighten up and follow the standards when talking to a VoIP client from another vendor.

Any form of protocol tunneling will subvert Zfone's RTP detection mechanism. In fact, most protocol tunneling is done to defeat various packet filtering mechanisms, such as firewalls. This does not indicate a problem with the ZRTP protocol. It's related to trying to run the ZRTP protocol as a packet filter in the IP stack, as Zfone does. It's a problem that would go away completely if the ZRTP protocol were integrated inside a VoIP client, for example by using our Zfone SDK.
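
The UDP-only assumption described above can be pictured with a short Python sketch of a packet-filter style check. This is not Zfone's actual detection logic. The function names are hypothetical and the heuristic is deliberately crude: it looks only at the RTP version bits and at a payload-type range that collides with RTCP.

```python
import struct


def looks_like_rtp(udp_payload: bytes) -> bool:
    """Heuristic check for an RTP version 2 header in a UDP payload."""
    if len(udp_payload) < 12:          # the fixed RTP header is 12 bytes
        return False
    version = udp_payload[0] >> 6      # top two bits of the first byte
    payload_type = udp_payload[1] & 0x7F
    # Payload types 72-76 collide with RTCP packet types, so skip them.
    return version == 2 and not 72 <= payload_type <= 76


def classify(ip_protocol: int, payload: bytes) -> str:
    """Only UDP (IP protocol 17) is inspected; TCP-tunnelled media is never seen."""
    if ip_protocol != 17:
        return "ignored"
    return "rtp" if looks_like_rtp(payload) else "other"


# A minimal RTP header: version 2, payload type 0, sequence 1, timestamp 160.
header = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234)
print(classify(17, header), classify(6, header))
```

The same bytes are recognized when they arrive over UDP and ignored when they arrive over TCP (IP protocol 6), which matches the "Idle" behavior: a filter built this way never gets a chance to start ZRTP on a tunnelled stream.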

Q: How does ZRTP verify the identity of who you call?

A: It doesn't. It doesn't even try. It's not necessary to verify the identity of the other party to establish a secure call. In a normal PSTN phone call, what happens if you call someone's number, and his wife answers the phone? Do you sound a klaxon horn and blow a fuse? No. You use your brain to figure it out. That's just how phones work, and it's no big deal. It's certainly no reason to fail to make a secure connection. The most important wiretapping vulnerability is a Man-in-the-Middle (MiTM) attack, which ZRTP guards against by using either a short authentication string, or key continuity, or both.

Of course, it helps if you know the identity of the caller before you answer the phone, like the Caller ID in the PSTN. The SIP protocol attempts to address that problem in the signaling. It is a different problem, and certainly worthy of attention, but it is not the job of ZRTP. ZRTP does not begin until after the user answers the phone and the call is underway. ZRTP merely establishes a secure wiretap-resistant connection to another ZRTP endpoint, and does it very well by narrowing the scope of its mission.

I don't know why so many people get hung up on this question of making an "authenticated phone call". It's a hard problem, and not worth the effort, in my opinion. Most phones, both at home and at work, are used by more than one person. And many people use other people's phones on a regular basis. There is not a one-to-one relationship between people and phones. Then there is the problem of establishing a digital identity. We could have a complex bureaucracy create a public key infrastructure that issues a certificate that we can attach to my phone, which can be displayed by your phone. Not only is that of questionable value in my opinion, but it's also hard. A number of clever people, including my friend Carl Ellison, have written about the complexity of creating unambiguous unique names and attaching them to people.

It's a mistake to view the world through a radar screen-- you must also use your eyes. And your common sense. The ZRTP protocol cannot tell you the name on the birth certificate of the person you are talking to. Or that the person you are talking to is telling the truth. And it cannot tell you if the other "endpoint" is then forwarding the call to another device. But neither can anything else.
