Frequently Asked Questions
Zfone's principal designer Phil Zimmermann answers a few questions on Zfone and ZRTP. Some of these answers are aimed at potential OEM customers in the VoIP industry who are familiar with VoIP jargon.
- What is Zfone?
- Can you show me a live demo?
- Why do we need Zfone? For that matter, why do we even need secure VoIP at all?
- Why is Zfone better?
- Will Zfone's protocol become an IETF standard?
- Are Zfone and ZRTP CALEA compliant?
- Will the government attempt to stop VoIP encryption?
- Can ZRTP use a PKI (public key infrastructure)?
- Exactly how do Zfone and ZRTP protect against a man-in-the-middle (MiTM) attack?
- ZRTP vs. other protocols:
- I'm a VoIP developer. How can I get the ZRTP protocol implemented in my VoIP application?
- Is Zfone open source?
- Is Zfone covered by any patents?
- Do you support Elliptic Curve Diffie-Hellman?
- Does Zfone or ZRTP work with ...? :
- Does Zfone work with Skype?
- Does Zfone work with plain old telephone service (POTS) phones?
- My VoIP service provider (such as Vonage or AT&T) gave me an ATA (Analog Telephony Adapter), or VoIP router, that allows me to connect my old-fashioned telephone to my broadband connection. Will Zfone work with that?
- Does ZRTP work with the Asterisk PBX? What about IAX?
- Can ZRTP be used with H.323 or other signaling protocols?
- Which platforms do you support?
- Isn't it a protocol layer violation to do the key management in the media instead of in the signaling?
- Many VoIP clients include some form of built-in text chat or instant messaging. Does Zfone or ZRTP encrypt those text messages?
- Does Zfone have any "back doors"?
- Is the Short Authentication String (SAS) vulnerable to an attacker with voice impersonation capabilities?
- Has anyone done any real security analysis on Zfone or ZRTP?
- Do Zfone and ZRTP encrypt Touch-Tone keypad DTMF tones?
- Why is it called ZRTP?
- Why do I have to register in order to download Zfone?
- If I integrate your SDK into my VoIP application, will I have to worry about U.S. export controls?
- Does Zfone protect against "social network analysis" and other forms of analysis based on traffic patterns?
- Why does Zfone show the IDLE status during some VoIP calls?
- How does ZRTP verify the identity of who you call?
Q: What is Zfone?
A: Zfone is my new secure VoIP phone software which lets you make secure encrypted phone calls over the Internet. The ZRTP protocol used by Zfone will soon be integrated into many standalone secure VoIP clients, but today we have a software product that lets you turn your existing VoIP client into a secure phone. The current Zfone software runs in the Internet protocol stack on any Windows XP, Mac OS X, or Linux PC, and intercepts and filters all the VoIP packets as they go in and out of the machine, and secures the call on the fly. You can use a variety of different software VoIP clients to make a VoIP call. The Zfone software detects when the call starts, and initiates a cryptographic key agreement between the two parties, and then proceeds to encrypt and decrypt the voice packets. It has its own little separate GUI, telling the user if the call is secure. It's as if Zfone were a "bump on the cord", sitting between the VoIP client and the Internet. Think of it as a bump in the protocol stack.
Q: Can you show me a live demo?
A: Funny you should ask. Here is a presentation and demo I did at the DEFCON conference in August 2007.
Q: Why do we need Zfone? For that matter, why do we even need secure VoIP at all?
A: As VoIP grows into a replacement for the PSTN, we will absolutely need to protect it, or organized crime will be attacking it as intensively as they attack the rest of the Internet today. VoIP is far more vulnerable to interception than the PSTN. A PC on your office network can unknowingly host spyware that can intercept your corporate VoIP calls and store and organize them on a hard disk for convenient browsing by criminals half a world away, giving them trade secrets and insider trading opportunities. To see an example of an actual implementation of this kind of spyware, take a look at Peter Cox's SIPtap demo.
The Internet is not a safe medium to carry our phone calls. But Zfone solves these problems. This technology has social benefits. It has the power to change our lives, enabling us to have a private conversation any time we want with anyone, anywhere - without buying a plane ticket.
Q: Why is Zfone better?
A: The ZRTP protocol has some nice cryptographic features lacking in many other approaches to VoIP encryption. Although it uses a public key algorithm, it does not rely on a public key infrastructure (PKI). In fact, it does not use persistent public keys at all. It uses ephemeral Diffie-Hellman with hash commitment, and allows the detection of man-in-the-middle (MiTM) attacks by displaying a short authentication string for the users to read and verbally compare over the phone. It has perfect forward secrecy, meaning the keys are destroyed at the end of the call, which precludes retroactively compromising the call by future disclosures of key material. But even if the users are too lazy to bother with short authentication strings, we still get fairly decent authentication against a MiTM attack, based on a form of key continuity. It does this by caching some key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. All this is done without reliance on a PKI, key certification, trust models, certificate authorities, or key management complexity that bedevils the email encryption world. It also does not rely on SIP signaling for the key management, and in fact does not rely on any servers at all. It performs its key agreements and key management in a purely peer-to-peer manner over the RTP packet stream. And it supports opportunistic encryption by auto-sensing if the other VoIP client supports ZRTP.
ZRTP doesn't need a PKI, and there are good reasons why it's a mistake to require a PKI for secure VoIP. Nonetheless, ZRTP can use a PKI if you already have one up and running. Follow this link for how this is done.
In July 2005 at the Black Hat briefings, Zfone's protocol was introduced and it brought some important innovations. Although they have been used in other forms in the past for other environments, such as PSTN encryption or remote terminal logins, they had never before been applied specifically to negotiate session keys for Secure RTP media streams:
- First protocol to use a short authentication string (SAS) to detect a MiTM attack, calculated in such a way as to constrain the MiTM to one guess to generate the correct SAS in his attack, which means the SAS can be quite short. PGPfone did this, but not for keying Secure RTP.
- First protocol to use the principle of key continuity specifically for negotiating keys for Secure RTP media streams. SSH has done this for other unrelated applications.
- First protocol to do its key negotiations in the media path. PGPfone did this, but not for keying Secure RTP.
- First protocol to do "best effort encryption" (or opportunistic encryption) to negotiate keys for Secure RTP, which means it can gracefully auto-sense if the other VoIP peer supports the same protocol, without disrupting a peer that lacks support for it, enabling immediate deployment.
Q: Will Zfone's protocol become an IETF standard?
A: Alan Johnston, Jon Callas, and I have submitted an IETF Internet Draft for the ZRTP protocol, which is used by Zfone to set up the cryptographic key agreement. Alan co-authored RFC 3261 which defines the SIP standard, and Jon is CTO at PGP Corp.
You can view the current state of the ZRTP Internet Draft here.
Q: Are Zfone and ZRTP CALEA compliant?
A: Zfone's architecture likely renders that question irrelevant. I'm not a lawyer, but it's my understanding that the Communications Assistance for Law Enforcement Act applies in the US to the PSTN phone companies and VoIP service providers, such as Vonage. CALEA imposes requirements on VoIP service providers to give law enforcement access to whatever they have at the service provider, which would be only encrypted voice packets. ZRTP does all its key management in a peer-to-peer manner, so the service provider does not have access to any of the keys. Only Zfone's end users are involved in the key negotiation, and CALEA does not apply to end users. If the VoIP service providers are smart, they will welcome ZRTP as a solution to being caught in the middle between the end users and the government. In the early 1990s, the government tried to control the end user's use of crypto by introducing the Clipper chip. That didn't go over too well politically, and had to be abandoned. The government will find it difficult to try again to stop end users from encrypting their traffic, regardless of whether that traffic is email, e-commerce web transactions, or VoIP calls.
However, there is one usage scenario for the ZRTP protocol that may be subject to CALEA. Consider the case of a VoIP service provider that operates a PBX (such as Asterisk) or conference call mixer for its customers. The service provider can implement the ZRTP protocol in the PBX, and this PBX can terminate the ZRTP media connections for both parties, acting as a trusted man-in-the-middle between two ZRTP-equipped end users, or act as a conference call mixer for several users. In this case, the PBX or conference bridge would be in a physical position to provide law enforcement access to either ZRTP or SRTP key material or actual plaintext media traffic, and thus be subject to CALEA. The usual end-to-end nature of ZRTP would be subverted. If your threat model includes this scenario, you may want to try to arrange direct end-to-end ZRTP connections whenever possible.
Q: Will the government attempt to stop VoIP encryption?
A: It's a fair question to ask in a post-9/11 world. Just how likely would it be for the government to restrict the end user's use of secure VoIP? The question of whether strong cryptography should be restricted by the government was debated all through the 1990s. This debate had the participation of the White House, the NSA, the FBI, the courts, the Congress, the computer industry, civilian academia, and the press. This debate fully took into account the question of terrorists using strong crypto, and in fact, that was one of the core issues of the debate. Nonetheless, society's collective decision (over the FBI's objections) was that on the whole, we would be better off with strong crypto, unencumbered with government back doors. The export controls were lifted and no domestic controls were imposed. This was a good decision, because we took the time and had such broad expert participation. The 9/11 attacks did not change the wisdom of that collective decision, and although civil liberties on the whole have eroded since then, we haven't lost our right to use strong crypto.
The law enforcement community will be understandably concerned about the effects encrypted VoIP will have on their ability to perform lawful intercepts. But what will be the overall effects on the criminal justice system if we fail to encrypt VoIP? Historically, law enforcement has benefited from a strong asymmetry in the feasibility of government or criminals wiretapping the PSTN. As we migrate to VoIP, that asymmetry collapses. VoIP interception is so easy, organized crime will be able to wiretap prosecutors and judges, revealing details of ongoing investigations, names of witnesses and informants, and conversations with their wives about what time to pick up their kids at school. The law enforcement community will come to recognize that VoIP encryption actually serves their vital interests.
Q: Can ZRTP use a PKI (public key infrastructure)?
A: Yes. The ZRTP protocol does have the optional capability to use a PKI if you already have a PKI up and running. But ZRTP does not actually require a PKI.
First let's review some good reasons why it's a mistake to make a secure VoIP protocol require a PKI. There are major problems and complexities with building, maintaining, and relying on PKI. That's why in the 1990s, a number of companies died trying to build and market PKI technology. See Ellison and Schneier's paper Ten Risks of PKI: What You're Not Being Told About Public Key Infrastructure and Ellison's paper Improvements on Conventional PKI Wisdom. In the email encryption world, a PKI architecture was the kiss of death for PEM and MOSS, both of which were swept aside by PGP. This also led to S/MIME never reaching critical mass, despite its advantage of being bundled in Microsoft's products. PGP became the dominant email encryption standard because it did not depend on a centrally managed PKI. Nonetheless, if you feel you must use a PKI and already have one, here's how ZRTP can make use of it.
The ZRTP Internet Draft describes how ZRTP can use a PKI-backed digital signature key to sign the short authentication string (SAS) in the ZRTP CONFIRM packet, to reduce reliance on users verbally comparing them during the call. Organizations that feel comfortable with PKIs can still get what they want. Thus, ZRTP offers all of the advantages of a protocol that can use a PKI, without actually becoming dependent on a PKI for security.
There is another way for a ZRTP implementation to benefit from a PKI, without becoming dependent on one. The IETF plans to someday add integrity protection to the delivery of SIP information, and that integrity protection will rely on a PKI. If this ever happens, ZRTP has protocol features that can leverage an integrity-protected SIP layer to provide integrity protection for ZRTP's Diffie-Hellman exchange in the media layer. This in turn confers protection against a man-in-the-middle (MiTM) attack, without requiring the users to verbally compare the SAS.
Q: Exactly how do Zfone and ZRTP protect against a man-in-the-middle (MiTM) attack?
A: The Diffie-Hellman key exchange by itself does not provide protection against man-in-the-middle (MiTM) attacks. To authenticate the key exchange, ZRTP uses a Short Authentication String (SAS), which is essentially a cryptographic hash of the two Diffie-Hellman values. The SAS value is rendered to both ZRTP endpoints. To carry out authentication, this SAS value is read aloud to the communication partner over the voice connection. If the values on both ends do not match, it indicates the presence of a man-in-the-middle attack. If they do match, there is a high probability that no man-in-the-middle is present. The use of hash commitment in the DH exchange constrains the attacker to only one guess to generate the correct SAS in his attack, which means the SAS can be quite short. A 16-bit SAS, for example, provides the attacker only one chance out of 65536 of not being detected.
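To make the commit-then-reveal idea concrete, here is a minimal Python sketch. It is an illustration under loud assumptions, not the actual ZRTP computation: the toy DH group, the use of SHA-256, the hex rendering of the SAS, and every function name below are hypothetical and are not taken from the ZRTP Internet Draft or the libZRTP SDK. It only shows the shape of the idea: the initiator commits to a hash of its DH value before seeing the other side's value, and both parties then derive a short string from the two values they saw.

```python
import hashlib
import secrets

# Toy parameters, for illustration only: a 127-bit Mersenne prime is far
# too small for real use, and a real protocol would use standardized groups.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Return an ephemeral (private, public) Diffie-Hellman pair."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def encode(value):
    """Fixed-width big-endian encoding of a group element."""
    return value.to_bytes(16, "big")


def commit(public):
    """Hash commitment to a DH public value, sent before revealing it."""
    return hashlib.sha256(encode(public)).digest()


def verify_commitment(commitment, revealed_public):
    """Check that the revealed value matches the earlier commitment."""
    return secrets.compare_digest(commitment, commit(revealed_public))


def short_auth_string(initiator_public, responder_public, bits=16):
    """Truncate a hash of both DH values to a short, readable string."""
    digest = hashlib.sha256(
        encode(initiator_public) + encode(responder_public)
    ).digest()
    value = int.from_bytes(digest, "big") >> (256 - bits)
    return format(value, "0{}x".format((bits + 3) // 4))


def run_exchange():
    """Simulate one unattacked exchange and return both parties' SAS."""
    a_priv, a_pub = dh_keypair()   # initiator
    b_priv, b_pub = dh_keypair()   # responder
    commitment = commit(a_pub)     # 1. initiator commits first
    # 2. responder sends b_pub; 3. initiator reveals a_pub
    if not verify_commitment(commitment, a_pub):
        raise ValueError("commitment mismatch: possible MiTM")
    # Both sides now hold the same DH shared secret...
    assert pow(b_pub, a_priv, P) == pow(a_pub, b_priv, P)
    # ...and each renders the SAS from the values it saw on the wire.
    return short_auth_string(a_pub, b_pub), short_auth_string(a_pub, b_pub)
```

Because the initiator is bound to its value before the responder's value is known, an attacker in the middle cannot keep trying DH values until the two short strings happen to collide. With `bits=16` there are 2**16 = 65536 possible strings, which is where the one-in-65536 figure comes from.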
ZRTP provides a second layer of authentication against a MiTM attack, based on a form of key continuity. It does this by caching some hashed key material to use in the next call, to be mixed in with the next call's DH shared secret, giving it key continuity properties analogous to SSH. If the MiTM is not present in the first call, he is locked out of subsequent calls. Thus, even if the SAS is never used, most MiTM attacks are stopped, because they weren't present in the first call.
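The key continuity idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not ZRTP's real key derivation: the hash constructions, the labels, and the names `session_key` and `next_cached_secret` are hypothetical, and random bytes stand in for real DH results. It shows only why an attacker who missed the first call is locked out of the second.

```python
import hashlib
import secrets


def session_key(dh_secret, cached_secret):
    """Mix this call's DH result with material retained from earlier calls."""
    return hashlib.sha256(b"session" + cached_secret + dh_secret).digest()


def next_cached_secret(key):
    """Derive the hashed material to cache for the next call."""
    return hashlib.sha256(b"retained" + key).digest()


# Call 1: no attacker present, so Alice and Bob share one DH secret.
dh1 = secrets.token_bytes(32)   # stand-in for a real DH result
key1 = session_key(dh1, b"")    # nothing cached yet
alice_cache = bob_cache = next_cached_secret(key1)

# Call 2, attacked: Mallory runs a separate DH exchange with each party,
# so she knows both DH secrets but never saw the cached material.
dh_alice_mallory = secrets.token_bytes(32)
dh_mallory_bob = secrets.token_bytes(32)
alice_key = session_key(dh_alice_mallory, alice_cache)
bob_key = session_key(dh_mallory_bob, bob_cache)
mallory_guess = session_key(dh_alice_mallory, b"")

# Call 2, not attacked: both sides derive the same key again.
dh2 = secrets.token_bytes(32)
honest_alice_key = session_key(dh2, alice_cache)
honest_bob_key = session_key(dh2, bob_cache)
```

In the attacked case Mallory's key does not match Alice's, so she cannot decrypt or re-encrypt the media, even though she completed a DH exchange with each party.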
ZRTP provides yet a third layer of protection against a MiTM attack. The IETF plans to add integrity protection to the delivery of SIP information, and that integrity protection will rely on a PKI. When this eventually deploys, ZRTP can take advantage of this. See the ZRTP Internet Draft on how ZRTP can leverage an integrity-protected SIP layer to provide integrity protection for ZRTP's Diffie-Hellman exchange in the media layer. This protects against a MiTM attack, without requiring the users to verbally compare the SAS. However, no VoIP clients yet offer a fully implemented SIP stack that provides end-to-end integrity protection for the delivery of SIP information. Thus, real-world implementations of ZRTP endpoints will continue to depend on SAS authentication for quite some time. Even after there is widespread availability of SIP products that offer integrity protection, many users will still be faced with the fact that the signaling path may be controlled by institutions that do not have the best interests of the end user in mind. In those cases, ZRTP's built-in SAS authentication will remain the gold standard for the prudent user. That, plus the key continuity features.
Q: What about DTLS-SRTP? Why not use that?
A: DTLS-SRTP is a derivative of TLS, a protocol designed for web browsers to talk to web servers, using a centrally managed PKI. A client-server protocol like TLS can work well in a client-server environment, but a phone call between two human beings is an ad-hoc peer-to-peer relationship, and the cryptographic key negotiations should reflect that. Instead of recycling a client-server protocol, ZRTP is purpose-built for VoIP. All these cryptographic protocols have a goal of negotiating keys in a way that stops man-in-the-middle (MiTM) attacks. To accomplish this, ZRTP doesn't need a PKI, and we don't need help from servers controlled by the phone company. Instead, ZRTP has two humans verbally compare a short authentication string to detect if there is a MiTM. Human beings can readily see if there is a MiTM by direct evidence and common sense. ZRTP harnesses the immense resources at both endpoints, which each have a brain with a hundred trillion synapses and the unique power of human intuition.
DTLS-SRTP tries to repurpose itself to VoIP's peer-to-peer environment, but it cannot escape its client-server roots, and that's why it depends so completely on the SIP servers to secure the connection. DTLS-SRTP's MiTM protection collapses in the absence of end-to-end integrity protection in the SIP layer. The only mechanism for this in SIP (besides S/MIME which has been around for 6 years without any implementation) is Enhanced SIP Identity (RFC 4474). However, it turns out that if you are using your SIP phone to call a regular phone number, then RFC 4474 doesn't provide integrity protection, and MiTM protection for DTLS-SRTP collapses. Why? Because for a regular phone number, the SIP identity is of the form sip:+13145551212@example.com asserted by example.com. A MiTM can just remove the RFC 4474 signature, change the a=fingerprint, then re-sign the identity as sip:+13145551212@example2.com asserted by example2.com. How does the callee know that this phone number is actually originating from example.com and not example2.com? There is no way to tell, hence, DTLS-SRTP has no protection from MiTM attacks. Regular phone numbers will be commonly used as identifiers for SIP phone customers for a long time, so this will continue to be a major security weakness for DTLS-SRTP.
Even if this problem with regular phone numbers is somehow solved, we are still left with the elephant in the room that in the final analysis the security of DTLS-SRTP requires a PKI. The PKI dependency will either be contained within DTLS-SRTP itself or within SIP, because of the DTLS-SRTP dependency on SIP end-to-end integrity. All SIP end-to-end integrity mechanisms require a PKI, and all the complexity and bureaucracy that implies. Many years of experience in the crypto industry leads us to believe that PKI is an inappropriate approach to achieving media security in VoIP.
Although far less important than the aforementioned pachyderm, here's another strike against this protocol: DTLS-SRTP must bear the additional cost of a signature calculation of its own, in addition to the signature calculation the SIP layer uses to achieve its integrity protection. ZRTP needs no signature calculation of its own to leverage the signature calculation carried out in the SIP layer. This may be relevant in low-power mobile platforms, or in highly loaded servers.
Q: My VoIP phone already uses SDES to negotiate keys. Is that safe?
A: Good heavens, no! While most VoIP phones don't encrypt their calls at all, a few of them have implemented SDES (RFC 4568) to negotiate SRTP session keys to encrypt the call. Of all the methods the IETF has considered for this, SDES is arguably the least secure. Here's how it works. Suppose Alice wants to talk to Bob, who lives in China. Alice's phone generates a random session key to encrypt the conversation, but somehow Alice has to get this key into Bob's hands so they can both use it. Her phone transmits this key via SIP to her VoIP service provider, namely her local phone company. Her phone company, who now has full knowledge of this session key, transmits it to Bob's phone company in China. Bob's phone company, owned by the Chinese government who now has full knowledge of the session key, transmits it to Bob's phone. Now Alice and Bob's phones are ready to start an encrypted voice conversation. What's wrong with this picture?
If Alice wants to talk with Bob about human rights issues, or how they might overcome trade barriers, the Chinese government can easily monitor the call. To stay competitive in a global economy, it's important that a company use end-to-end encryption to protect its business communications from foreign governments. And some of us have doubts about whether our domestic phone company will always act with our best interests in mind.
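The exposure is easy to see in code. The Python sketch below builds a simplified SDP offer carrying an SDES `a=crypto` attribute in the general form RFC 4568 describes, then shows how any server that relays the signaling could read the key back out. The addresses, session identifiers, and function names are made up for illustration, and the parsing handles only the simple `inline:` case.

```python
import base64
import secrets


def make_sdes_offer(master_key_and_salt):
    """Build a minimal SDP offer that carries the SRTP key inline."""
    inline = base64.b64encode(master_key_and_salt).decode("ascii")
    lines = [
        "v=0",
        "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
        "s=-",
        "c=IN IP4 192.0.2.10",
        "t=0 0",
        "m=audio 49170 RTP/SAVP 0",
        "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:" + inline,
    ]
    return "\r\n".join(lines) + "\r\n"


def extract_keys_as_intermediary(sdp):
    """What any signaling server on the path can do with the offer."""
    keys = []
    for line in sdp.splitlines():
        if not line.startswith("a=crypto:"):
            continue
        # Layout: a=crypto:<tag> <crypto-suite> <key-params> ...
        key_params = line.split()[2]
        method, _, key_info = key_params.partition(":")
        if method == "inline":
            # Optional lifetime and MKI fields follow after "|".
            keys.append(base64.b64decode(key_info.split("|")[0]))
    return keys


# Alice's phone: 16-byte master key plus 14-byte salt for this suite.
alice_key = secrets.token_bytes(30)
offer = make_sdes_offer(alice_key)

# Every provider that relays the offer recovers exactly the same key.
recovered = extract_keys_as_intermediary(offer)
```

Nothing here requires breaking any cryptography. Unless the signaling itself is protected on every hop, and every server along the way is trusted, the session key is simply handed to each intermediary.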
If PGP Corp implemented such an embarrassingly bad protocol, it would be met with shocked disbelief in the crypto community. But VoIP product vendors seem to get away with it, probably because crypto is not part of the VoIP industry's core competency. I've talked to VoIP vendors who just shrug and candidly admit they implemented SDES so they can simply check the "supports encryption" checkbox on their product feature checklist. Their excuse is that their customers have not demanded anything better.
Q: Why can't I just use IPsec to encrypt my VoIP calls?
A: Well, you can, but it would be a bad idea in most cases. IPsec encryption is done down in the IP layer of the Internet protocol suite's protocol stack, which is too low a layer to let the user know if it is running. Some routers support IPsec, and some don't. You don't know if the other party supports IPsec, so some connections will not be encrypted, and you would never know it. If you don't know for sure whether the call is encrypted, what good is it? It's better to do the encryption at the application layer, so that the user can be told if the call is encrypted.
Q: Why do we need ZRTP if we already have SRTP? Isn't SRTP good enough?
A: This is the wrong question to ask. Despite the similarity in the two names, it is not a choice between SRTP and ZRTP. One does not replace the other. SRTP is the protocol we use to encrypt the low level voice packets. But SRTP alone is not the whole solution. You cannot use SRTP until both parties have agreed on what key to use for the SRTP encryption. That's where ZRTP comes in. ZRTP is the protocol that the two parties use to negotiate the SRTP session key. Zfone uses SRTP, but it uses ZRTP first to negotiate the SRTP session key.
Q: I'm a VoIP developer. How can I get the ZRTP protocol implemented in my VoIP application?
A: We have a nice SDK for VoIP developers to integrate the ZRTP protocol into their software or hardware VoIP clients or servers. The software is implemented in C. Visit the Zfone libZRTP SDK page to license it, and reduce your time-to-market quite a bit. We also have comprehensive API documentation for the SDK. And to help you debug your product by using your Wireshark packet sniffer to recognize and dissect ZRTP packets, check out our nifty ZRTP packet dissector patch for Wireshark.
Q: What's the difference between ZRTP, Zfone, and the libZRTP SDK?
A: ZRTP is the protocol specification we developed that describes how to negotiate session keys for a secure VoIP phone call. We have a software development kit, a subroutine library for developers that implements the ZRTP protocol, called libZRTP, which can be integrated into VoIP applications. And we have a complete software application called Zfone that showcases the libZRTP SDK. Zfone is ready to run for end users, not just VoIP developers. Zfone is not itself a VoIP client-- it is an "adapter" that lets you make secure phone calls using your favorite VoIP client. If you are developing your own VoIP application and you want to build in ZRTP capabilities, you should get our libZRTP SDK. If you just want to make secure VoIP phone calls as an end user, get Zfone.
Q: Is Zfone open source?
A: I'm a firm believer in publishing the source code for cryptographic software for peer review, to build public confidence that it contains no back doors, a tradition I started in 1991 with PGP. PGP is a proprietary product, even though the source code is available for peer review. Publishing the source code for peer review is not the same as making it available under an open source license. I've paid for this Zfone project out of my own personal funds, and I'd like to continue paying my engineers to make improvements and make a living while I'm doing this. Because of that, I was at first reluctant to license Zfone under an open source license. But a number of people suggested I could still make money if I made at least part of Zfone available under a dual-licensing approach, with GPL licensing for inclusion in open source projects, and commercial licensing for proprietary products. Some software, such as MySQL, has taken this path. So I decided to take the plunge and dual-license the Zfone SDK. This transition to a dual-licensed GPL approach will go into effect when we finish our beta test phase, which should be Real Soon Now.
Zfone has several major components, and not all of them are published under the same licensing terms. The entire body of source code for the complete Zfone client software is published for peer review. In addition, the Zfone libZRTP SDK library is published under the GPL version 2. The libZRTP SDK may be included in any GPL project (and if you need it for other non-GPL open source applications, contact us). However, the rest of the Zfone application as a whole (as opposed to just the libZRTP SDK library) remains proprietary, and is published for public peer review, but not under an open source license. For a full discussion of this, see the Zfone Licensing Policy page.
Q: Is Zfone covered by any patents?
A: I think software patents stifle innovation and have done a great deal of harm to the software industry, especially in the crypto world. I agree with the League for Programming Freedom on this matter. The RSA patent holders wielded their patent to do all they could to destroy PGP in the 1990s. I don't ever want to experience this problem again, so I applied for a patent relating to some aspects of Zfone and plan to use it defensively against other companies who might make patent claims against us in the future. We have filed an Intellectual Property declaration with the IETF regarding any patent rights we may have in the future, and have offered a royalty-free license under most conditions, for people who don't sue us. Having the patent also allows us to better serve the user community's interests by providing leverage to get other ZRTP implementors to abide by ZRTP's back-door-resistant features. For more details, see the IPR statement we filed with the IETF. Keep in mind that the IPR statement is not the real license, which would contain the definitive details.
Q: Do you support Elliptic Curve Diffie-Hellman?
A: Why, yes, it just so happens, we do. Although ZRTP always supports classic Diffie-Hellman, if you license our libZRTP SDK, we offer optional support for the ECDH protocol as defined by NIST SP 800-56A and NSA Suite B, which uses the same elliptic curves as used by ECDSA in FIPS 186-3. Elliptic curve algorithms are the next generation of public key cryptography, offering a level of security that better matches the full key strength of AES-256. The free versions of Zfone do not support ECDH.
Q: Does Zfone work with Skype?
A: No. Skype uses a closed proprietary protocol, which they do not publish. That makes it hard to make Zfone work with it. Skype does not interoperate with the rest of the VoIP industry, which is built on open standards. I decided to follow the industry standards.
Q: Does Zfone work with plain old telephone service (POTS) phones?
A: Nope. Sorry. It only works with VoIP protocols, not PSTN, or POTS, phones. VoIP is the wave of the future, so I'm not motivated to try to retrofit this to work with the old public switched telephone network. A famous hockey player said "I try to skate to where I think the puck will be."
Q: My VoIP service provider (such as Vonage or AT&T) gave me an ATA (Analog Telephony Adapter), or VoIP router, that allows me to connect my old-fashioned telephone to my broadband connection. Will Zfone work with that?
A: Well, not with that exact setup, no. Your ATA or VoIP router is a hardware device that lets you connect your old analog telephone to a VoIP network. To make a secure call with that kind of setup, you would have to have an ATA with the ZRTP protocol integrated inside, which will happen someday, we hope. In the meantime, if you really want to run Zfone now, you need to run a software VoIP client (such as X-Lite, Gizmo, SJphone, or perhaps a software VoIP client supplied by your VoIP service provider) on your PC or Macintosh computer. You can use the software VoIP client to connect to your VoIP service provider from your computer, not from a normal telephone. Then you can install Zfone on the same computer, and have it convert your VoIP call to the ZRTP protocol. And, of course, the other party you are calling must also be running VoIP with the ZRTP protocol (such as Zfone) on the other end. This will become simpler when ATA or VoIP router vendors integrate the ZRTP protocol inside their hardware.
However, there may be a way to use an ATA even if the ATA does not yet have ZRTP integrated inside. Ripcord Networks makes a hardware "bump-in-the-wire" product called Ripcord Reserve that might work. We haven't tried it with Vonage yet, but in principle, it could be connected between the ATA and the network and encrypt the VoIP call on the fly. We'll have to see how well it does in testing for that environment. Contact Ripcord Networks if you want to ask them if it works with Vonage, or if you want to order this device.
Q: Does ZRTP work with the Asterisk PBX? What about IAX?
A: ZRTP has been successfully integrated into Asterisk, an open source PBX server from Digium. This modified version of Asterisk supports ZRTP in SIP/RTP calls, and we are working with Digium to make it available in their Asterisk products. For more information on ZRTP support for Asterisk SIP/RTP calls, see our Asterisk support page. However, ZRTP and IAX (Inter-Asterisk eXchange protocol) are not well suited for each other in their present forms. Perhaps I'll have a look at IAX more closely to see what can be done to improve its security.
Q: Can ZRTP be used with H.323 or other signaling protocols?
A: Yes, ZRTP can be used with any signaling protocol, including SIP, H.323, MGCP, Jingle, and Peer-to-Peer SIP (which I think is coolest of all). ZRTP is independent of the signaling layer, because it does all its key negotiations in the media stream. In fact, Zfone now encrypts calls made with the Google Talk VoIP client, which uses Jingle for signaling instead of SIP (but only when Google Talk is using RTP for its media stream). ZRTP is the only VoIP encryption protocol with this much flexibility.
Q: Which platforms do you support?
A: Zfone runs on Windows XP and Vista, both 32-bit and 64-bit versions. Zfone also runs on Linux and Mac OS X. Zfone will encrypt audio and video for Apple iChat calls on Mac OS X (Tiger and Leopard), but not file transfers, text chat, or remote desktop control. Zfone has been tested with these VoIP clients: X-Lite, Gizmo, XMeeting, and SJphone. It does not work with Skype.
The Zfone libZRTP SDK runs on more platforms than the Zfone application. It has been used on Windows XP, Vista, Mac OS X, Linux, Windows Mobile, and Symbian.
Q: Isn't it a protocol layer violation to do the key management in the media instead of in the signaling?
A: Some proponents of other VoIP encryption schemes say that it offends their sensibilities to see ZRTP negotiate the cryptographic keys in the media stream, instead of in the signaling layer, as other VoIP encryption schemes do. They call it a "layer violation". But to me (and to a number of other protocol designers I've talked to), it seems clear that the signaling should take care of its own key negotiation for signaling authentication, and the media layer should negotiate its own keys for media encryption. The two layers should each take care of their own cryptographic needs. If anything, doing the media encryption key negotiation in the signaling layer is the real layer violation.
In the same vein, I don't feel that the VoIP service providers can always be trusted to act with my interests in mind, so I don't want to involve their SIP servers in my encryption key negotiations. If I want to speak Navajo with my friend on the phone, I shouldn't have to clear it first with the phone company. It's just none of their business. And that's part of what makes ZRTP so broadly appealing.
It's also worth noting that traditional secure telephones in the PSTN world, such as the AT&T TSD-3600 or the STU-III, did all their key negotiations in the media stream. They used a modem to establish a digital channel on a normal voice grade phone line, negotiated their keys, and sent an encrypted voice stream all on the same channel. No one called it a layer violation. This is the way secure phones always worked before VoIP came along.
Q: Many VoIP clients include some form of built-in text chat or instant messaging. Does Zfone or ZRTP encrypt those text messages?
A: No. Not yet, anyway. ZRTP does very well by limiting its mission to just managing the keys and encrypting RTP media streams for VoIP. Also, these instant text messaging protocols come in a number of different variants, such as AIM or Jabber, with different VoIP clients supporting different instant messaging protocols. Each of them will require a different method of encryption, and that remains to be worked out. Some methods already exist for encrypting some forms of text chatting, such as the one offered by PGP Corp. We're looking at the most appropriate methods to add this capability to Zfone.
Q: Does Zfone have any "back doors"?
A: Anyone who knows anything about me knows the answer is No. In fact, I have a whole page on that subject regarding PGP software, and it applies equally to Zfone. Now, having said that, I remind you that Zfone is still beta software, and has a few bugs. Until I do a real release, I make no claims about it being secure. We haven't finished our internal code reviews, and we may yet discover bugs that affect security. That's also why we have a public beta. We need you to help us test the code.
It's easy to keep back doors out of your own product, as I do with Zfone. It's much harder to keep back doors out of other vendors' implementations of the ZRTP protocol. Nonetheless, I decided to at least make an attempt. Take a look at these ideas for ZRTP's back-door-resistant features.
Q: Is the Short Authentication String (SAS) vulnerable to an attacker with voice impersonation capabilities?
A: In practical terms, no. It is a mistake to think this is simply an exercise in voice impersonation (perhaps this could be called the "Rich Little" attack). Although there are digital signal processing techniques for changing a person's voice, that does not mean a man-in-the-middle attacker can safely break into a phone conversation and inject his own short authentication string (SAS) at just the right moment. He doesn't know exactly when or in what manner the users will choose to read aloud the SAS, or in what context they will bring it up or say it, or even which of the two speakers will say it, or if indeed they both will say it. In addition, some methods of rendering the SAS involve using a list of words, notably the PGP word list, in a manner analogous to how pilots use the NATO phonetic alphabet to convey information. This can make it even more complicated for the attacker, because these words can be worked into the conversation in unpredictable ways. Remember that the attacker places a very high value on not being detected, and if he makes a mistake, he doesn't get to do it over.
To further reduce the likelihood of a voice impersonation attack, we recommend that both parties read the SAS aloud if they feel that the call is likely to invite the attention of an especially resourceful opponent who is willing to take risks. We also recommend that, if the user interface permits, the SAS be rendered via the PGP word list instead of base-32 digits.
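As an illustration of the base-32 rendering mentioned above, here is a minimal Python sketch. It assumes the rule given in the published ZRTP specification (RFC 6189): the leftmost 20 bits of a 32-bit SAS value are shown as four characters from the z-base-32 alphabet. The function name `render_sas_base32` is hypothetical, and this is not Zfone's own code.

```python
# z-base-32 alphabet (a human-oriented base-32 variant).
ZBASE32_ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def render_sas_base32(sas_value: int) -> str:
    """Render a 32-bit SAS value as four z-base-32 characters.

    Assumption: only the leftmost 20 bits are used, 5 bits per character.
    """
    if not 0 <= sas_value < 2**32:
        raise ValueError("sas_value must fit in 32 bits")
    top20 = sas_value >> 12  # keep the leftmost 20 of 32 bits
    return "".join(
        ZBASE32_ALPHABET[(top20 >> shift) & 0x1F] for shift in (15, 10, 5, 0)
    )


if __name__ == "__main__":
    # Made-up sample value, not taken from a real call.
    print(render_sas_base32(0x12345678))
```

Four base-32 characters carry 20 bits, so a blind guess by an attacker succeeds about once in a million tries. The word-list rendering is not shown here because it needs the full PGP word list.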
Some people have raised the question that even if the attacker lacks voice impersonation capabilities, it may be unsafe for people who don't know each other's voices to depend on the SAS procedure. This is not as much of a problem as it seems, because it isn't necessary that they recognize each other by their voice, it's only necessary that they detect that the voice used for the SAS procedure matches the voice in the rest of the phone conversation.
Q: Has anyone done any real security analysis on Zfone or ZRTP?
A: Yes. Andy Clark's security analysis company, Detica Forensics, did a report in January 2008, available here as a PDF file: Forensic Analysis of Zfone. We stood up pretty well in this report. Of course, this does not prove that there are no security flaws in Zfone or ZRTP. But it does help build more confidence in them, at least in the aspects of Zfone that were covered by the analysis.
Q: Do Zfone and ZRTP encrypt Touch-Tone keypad DTMF tones?
A: Yes. ZRTP encrypts all RTP traffic, including Touch-Tone keypad DTMF tones. DTMF tones are carried in the RTP media stream using methods defined by RFC 2833, embedded as special RTP payload types. We encrypt these along with the rest of the RTP media stream, which is important because people use DTMF tones to enter their credit card numbers when they call their bank, for example.
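To see why encrypting these packets matters, here is a minimal Python sketch that decodes the 4-byte telephone-event payload defined by RFC 2833. Without encryption, anyone who captures the RTP stream can read the pressed key straight out of the first byte. The names `TelephoneEvent`, `parse_telephone_event`, and `dtmf_key` are illustrative assumptions, not part of Zfone or libZRTP, and the sample bytes are made up.

```python
import struct
from typing import NamedTuple, Optional

# RFC 2833 event codes 0-15 map to these DTMF keys.
DTMF_KEYS = "0123456789*#ABCD"


class TelephoneEvent(NamedTuple):
    event: int      # event code (0-15 are the DTMF keys)
    end: bool       # E bit: set on the packet(s) that end the event
    volume: int     # tone power level, 6 bits
    duration: int   # duration in RTP timestamp units


def parse_telephone_event(payload: bytes) -> TelephoneEvent:
    """Decode the fixed 4-byte telephone-event payload."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags_volume, duration = struct.unpack("!BBH", payload[:4])
    return TelephoneEvent(
        event=event,
        end=bool(flags_volume & 0x80),
        volume=flags_volume & 0x3F,
        duration=duration,
    )


def dtmf_key(ev: TelephoneEvent) -> Optional[str]:
    """Return the keypad character, or None for non-DTMF events."""
    return DTMF_KEYS[ev.event] if ev.event < len(DTMF_KEYS) else None


if __name__ == "__main__":
    sample = bytes([5, 0x8A, 0x03, 0x20])  # hypothetical captured payload
    ev = parse_telephone_event(sample)
    print(dtmf_key(ev), ev)
```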
There was a problem with an old version of the Zfone beta software regarding the encryption of DTMF tones. Forbes magazine had an article on 2 August 2007 that reported a problem which they attributed to ZRTP's handling of DTMF tone encryption. In fact this was not due to any deficiencies in the ZRTP protocol, but was due to a software bug in the Zfone beta software that existed in April 2007. That bug was fixed before the article appeared. The bug only happened when Zfone was used in conjunction with SJLabs' SJphone, triggered by a subtle interaction with a bug in SJphone that was improperly generating DTMF packet sequences. Current versions of Zfone always encrypt DTMF tones from all VoIP clients, including SJphone.
There is a potential but unlikely problem with DTMF handling that has never been reported in Zfone. In unusual cases it is possible to send DTMF over the SIP channel. Some very old, non-standard SIP clients send it using the SIP INFO method; there is no RFC for this use, and the SIP standards community strongly discourages it. There is also a newer SIP extension (RFC 4730) known as KPML (Keypress Markup Language), which can be used to send DTMF over a SIP NOTIFY. Very few VoIP clients implement this yet, and the ones that do don't seem to be using it and can easily be configured not to use it. If you ever find a VoIP client that uses KPML, we recommend that you simply disable this feature and allow DTMF to be carried the traditional way in the RTP media stream. Note that all RTP media encryption protocols, not just ZRTP, would be equally affected by this problem if SIP is used to carry DTMF.
Q: Why is it called ZRTP?
A: Well, I'd rather not have to use the unwieldy acronym for Media Path Key Agreement for Secure RTP. It was Alan Johnston who suggested ZRTP, because it negotiates the session keys for SRTP, the Secure Real-time Transport Protocol (RFC 3711). We just used a regrettably less descriptive mutation of that name, incorporating my last initial. In defense of my immodesty, it's worth noting that in the crypto community, eponymous crypto protocols are very much the norm. Examples include RSA, Diffie-Hellman, ElGamal, CAST, RC2 and RC4 (Ron's Code), Blakley-Shamir, and many others.
Alan's original rationale for the name ZRTP also was based on the fact that ZRTP was originally (in its first couple of Internet drafts) based on adding header extensions to RTP packets, so ZRTP was really a variant of RTP. We have since changed the packet format to no longer encapsulate the ZRTP packets inside RTP header extensions, but to make them into a separate packet format that is distinguishable from RTP. In view of that change, ZRTP is now a pseudo-acronym.
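For illustration, a receiver can tell the two packet types apart with a few byte comparisons. The sketch below assumes the layout in the published ZRTP specification (RFC 6189), where the two RTP version bits are zero and the magic cookie 'ZRTP' (0x5a525450) sits where an RTP packet would carry its timestamp. Earlier drafts may have differed, and `is_zrtp_packet` is a hypothetical helper, not a libZRTP function.

```python
ZRTP_MAGIC_COOKIE = b"ZRTP"  # 0x5a525450, at byte offsets 4..7


def is_zrtp_packet(udp_payload: bytes) -> bool:
    """Heuristically decide whether a UDP payload is a ZRTP packet.

    Assumed layout: 12-byte header, RTP version bits set to 0,
    magic cookie in the field RTP uses for its timestamp.
    """
    if len(udp_payload) < 12:
        return False
    version_bits = udp_payload[0] >> 6  # RTP packets carry version 2 here
    return version_bits == 0 and udp_payload[4:8] == ZRTP_MAGIC_COOKIE


if __name__ == "__main__":
    # Made-up headers: first byte, flags, sequence number, cookie/timestamp, SSRC.
    zrtp_like = bytes([0x10, 0x00, 0x00, 0x01]) + b"ZRTP" + bytes(4)
    rtp_like = bytes([0x80, 0x00, 0x00, 0x01]) + bytes(8)
    print(is_zrtp_packet(zrtp_like), is_zrtp_packet(rtp_like))
```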
Q: Why do I have to register in order to download Zfone?
A: Although the US has ended most of its export controls for crypto software, there are still some reasonable residual export controls in place, namely, to prevent the software from being exported to a few embargoed nations, such as North Korea, Iran, Libya, Syria, and Sudan. And for commercial encryption software that you actually pay for (which does not include this free public beta), there are now requirements to check customers against government watch lists as well, which is something that companies such as PGP comply with these days. PGP Corp volunteered to host the public beta software on their server, with all the appropriate checks in place.
The Zfone registration page checks your IP address against the list of embargoed countries, then emails you a link that you must click to start your download. It checks your IP address again when you follow that link, which presumably means that you did not receive your email in an embargoed country, and that the download itself did not go to an embargoed country. The U.S. Government deems this adequate evidence that we made our best efforts to comply with U.S. export laws. Staying out of that kind of trouble is important to me. Been there. Done that.
Q: If I integrate your SDK into my VoIP application, will I have to worry about U.S. export controls?
A: Well, yes, but it's pretty easy to deal with these days. If you plan to export your product from the U.S., you will have to file some papers with the U.S. Commerce Department. I did that for Zfone, and you can see the results of that here. I recommend you use Roszel Thomsen, at the law firm of Thomsen and Burke LLP.
Q: Does Zfone protect against "social network analysis" and other forms of analysis based on traffic patterns?
A: No, not at all. Zfone just encrypts the contents of the call. The only way to protect against traffic analysis is to go through multiple intermediaries, which is a technique that has been used to protect email and web browsing (see the Tor project for an example of this). But this adds latency to communications, which may be unnoticeable for email, and at least tolerable for web browsing, but would be unacceptable for phone calls. Further, these countermeasures may be ineffective against a clever and resourceful opponent, because it's hard to hide the timing and length of the messages, especially if there are real-time communication requirements.
Q: Why does Zfone show the IDLE status during some VoIP calls?
A: Some VoIP clients attempt to traverse NAT routers by sending RTP voice and video packets through TCP instead of UDP. This protocol tunneling violates the IETF standards for VoIP, which require that RTP media packets be sent over UDP. Zfone assumes that RTP will be found only in UDP packets, and thus will not detect RTP sent through TCP. In that case, Zfone's GUI displays the "Idle" status during a call, and does not engage the ZRTP protocol. Sometimes the packets are going through a media relay which converts them to UDP for the other party, whose Zfone client can therefore see the media stream, but searches in vain for the idled ZRTP peer and displays the "NOT Secure / No ZRTP Peer" status.
If this happens, here are a couple of workarounds: 1) The best solution is to move one of the parties' computers (in particular, the one that displays IDLE) off their local network to an external IP address, thereby simplifying the NAT traversal problem. Even better, move both computers to external IP addresses. 2) Or it might help to switch one of the parties (especially the IDLE one) to a different VoIP client. Often the VoIP client software decides to straighten up and follow the standards when talking to a VoIP client from another vendor.
Any form of protocol tunneling will subvert Zfone's RTP detection mechanism. In fact, most protocol tunneling is done to defeat various packet filtering mechanisms, such as firewalls. This does not indicate a problem with the ZRTP protocol. It's related to trying to run the ZRTP protocol as a packet filter in the IP stack, as Zfone does. It's a problem that would go away completely if the ZRTP protocol were integrated inside a VoIP client, for example by using our Zfone SDK.
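To make the detection issue concrete, here is a rough Python sketch of a filter that looks for RTP only in UDP traffic. It checks just the header length and the version field from the RTP specification (RFC 3550). This is not Zfone's actual detection code, and the names `looks_like_rtp` and `classify` are hypothetical.

```python
RTP_VERSION = 2
RTP_MIN_HEADER_LEN = 12  # fixed RTP header size in bytes


def looks_like_rtp(payload: bytes) -> bool:
    """Cheap heuristic: long enough for an RTP header, version field is 2."""
    if len(payload) < RTP_MIN_HEADER_LEN:
        return False
    return (payload[0] >> 6) == RTP_VERSION


def classify(transport: str, payload: bytes) -> str:
    """Mimic a filter that only inspects UDP payloads for RTP."""
    if transport.lower() != "udp":
        # RTP tunneled through TCP is never examined, so the filter
        # would stay idle for such a call.
        return "ignored"
    return "rtp" if looks_like_rtp(payload) else "other"


if __name__ == "__main__":
    packet = bytes([0x80, 0x00, 0x00, 0x01]) + bytes(8) + b"voice"  # made-up RTP
    print(classify("udp", packet))  # examined and recognized
    print(classify("tcp", packet))  # same bytes, never examined
```

The same bytes give different results depending only on the transport, which is the situation the IDLE status reflects.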
Q: How does ZRTP verify the identity of who you call?
A: It doesn't. It doesn't even try. It's not necessary to verify the identity of the other party to establish a secure call. In a normal PSTN phone call, what happens if you call someone's number, and his wife answers the phone? Do you sound a klaxon horn and blow a fuse? No. You use your brain to figure it out. That's just how phones work, and it's no big deal. It's certainly no reason to fail to make a secure connection. The most important wiretapping vulnerability is a Man-in-the-Middle (MiTM) attack, which ZRTP guards against by using either a short authentication string, or key continuity, or both.
Of course, it helps if you know the identity of the caller before you answer the phone, like the Caller ID in the PSTN. The SIP protocol attempts to address that problem in the signaling. It is a different problem, and certainly worthy of attention, but it is not the job of ZRTP. ZRTP does not begin until after the user answers the phone and the call is underway. ZRTP merely establishes a secure wiretap-resistant connection to another ZRTP endpoint, and does it very well by narrowing the scope of its mission.
I don't know why so many people get hung up on this question of making an "authenticated phone call". It's a hard problem, and not worth the effort, in my opinion. Most phones, both at home and at work, are used by more than one person. And many people use other people's phones on a regular basis. There is not a one-to-one relationship between people and phones. Then there is the problem of establishing a digital identity. We could have a complex bureaucracy create a public key infrastructure that issues a certificate that we can attach to my phone, which can be displayed by your phone. Not only is that of questionable value in my opinion, but it's also hard. A number of clever people, including my friend Carl Ellison, have written about the complexity of creating unambiguous unique names and attaching them to people.
It's a mistake to view the world through a radar screen; you must also use your eyes. And your common sense. The ZRTP protocol cannot tell you the name on the birth certificate of the person you are talking to. Or that the person you are talking to is telling the truth. And it cannot tell you if the other "endpoint" is then forwarding the call to another device. But neither can anything else.