The period between 2007 and 2010 represents a market transition in the industry, when sales of IP-based components began outselling analog-based systems. While analog systems have a cost advantage in small deployments (sixteen cameras or fewer), when a larger number of cameras is deployed, IP-based systems may be more cost-effective initially and have a lower ongoing total cost of ownership. IP-based video surveillance systems, especially the end-node (the IP camera), have several operational and technological advantages. Why implement IP video surveillance over analog-based systems? The following subsections provide the answer.
Leveraging VoIP Adoption
Many of the advantages of implementing IP video surveillance are similar to those of VoIP adoption. The fundamental reason is the cost savings of using the IP network for both voice and data. By adding the transport of video surveillance on the existing highly-available IP network, the cost savings realized from eliminating the separate cable plant for voice extend as well to the elimination of the separate cable plant for video.
Not only can the wiring for media transport be eliminated, but also the cabling for electrical power. Just as the IP phone uses Power over Ethernet (PoE) in enterprise VoIP deployments, so do many fixed-installation IP cameras. While separate power continues to be a requirement for some camera deployments (Pan-Tilt-Zoom housings, wireless cameras, and cameras that require fibre connectivity due to distance), PoE offers a substantial cost savings.
IP video surveillance cameras, once connected to the network, may be remotely configured and managed from a central command center. The installing technician must have a laptop to focus the lens and adjust the viewpoint of the camera, but following this initial installation, the camera configuration may be completed by a technician in a central, rather than local, facility.
Access Video Any Time, Any Place
With IP-based systems, video feeds are encoded into Motion JPEG or MPEG-4/H.264 formats and stored as digital images on a computer disk array. This provides the ability to access the video, by way of the networked digital video recorder, through the IP network at any time, from any place. These digital images do not degrade in quality from duplication like analog recordings on magnetic tape. They can be replicated and posted on web servers, distributed to law enforcement as E-mail attachments, and sent to news outlets. When analog-based systems were the norm, loss prevention/investigations staff might have had to visit the location of the incident to view the video, or a tape or DVD would have needed to be shipped by overnight courier. These inefficiencies no longer exist with IP-based systems and WAN connectivity to the physical location.
Intelligence at the Camera
With IP cameras, local processing of the video image may be done during capture, and analysis such as motion detection and tampering detection logic may raise alerts by communicating with a central server. The alert may use a variety of IP protocols: SMTP (E-mail), Syslog, File Transfer Protocol (FTP), or a TCP socket connection with a small keyword in the payload. The Cisco 4500 IP Cameras have additional DSP capabilities specifically designed to support real-time video analytics on the camera. This option allows analytics vendors to develop firmware in the future to run on these resources.
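The last alert option, a TCP socket connection with a small keyword in the payload, might look roughly like the following. This is only a sketch: the keyword strings, the payload layout, and the function names are hypothetical and do not describe any particular camera firmware.

```python
import socket

# Hypothetical keywords; a real camera and server would agree on their own.
ALERT_KEYWORDS = {"motion": "MOTION", "tamper": "TAMPER"}


def build_alert_payload(event: str, camera_id: str) -> bytes:
    """Return a short ASCII payload: keyword, space, camera id, newline."""
    keyword = ALERT_KEYWORDS[event]
    return f"{keyword} {camera_id}\n".encode("ascii")


def send_alert(host: str, port: int, payload: bytes, timeout: float = 2.0) -> None:
    """Open a TCP connection to the central server, send the payload, close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
```

A central server listening on the agreed port would read one line per alert and match on the leading keyword.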
Barriers to Success
While the advantages of an IP-based system are considerable, there are some barriers to success. They mainly revolve around the human element—job responsibilities, training, and education. Typically, the physical security manager and the network manager have no overlapping job responsibilities and therefore have little need to interact with each other. The physical security manager has job responsibilities targeted at loss prevention, employee and customer/visitor safety, security and crime prevention. Because of this, the physical security manager is more confident with a dedicated, reliable, physically separate cable plant.
Many installations of physical security cameras and the accompanying components are solely or partially implemented by value-added resellers (VARs) who are specialists in their field, but not yet experts in IP networking. The VAR must become more fluent in internetworking and the network manager must understand the requirements of the physical security processes and applications.
The key elements of video surveillance are the three Rs: resolution, retention, and reliability. For an IP video surveillance deployment to be a success on the IP network, the network manager must pay careful attention to the reliability element so that the physical security manager can be successful.
Resolution, one of the three Rs of video surveillance, directly influences the amount of bandwidth consumed by the video surveillance traffic. The amount of bandwidth required is a function of image quality (itself a function of the resolution) and frame rate. As image quality and frame rate increase, so do bandwidth requirements.
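As a rough illustration of how resolution and frame rate drive bandwidth, the sketch below computes the uncompressed bit rate of a stream and then divides it by a compression ratio. The 24 bits per pixel and the compression ratio are assumptions chosen for illustration; real codecs do not compress at a fixed ratio.

```python
def raw_bitrate_bps(width: int, height: int, fps: float, bits_per_pixel: int = 24) -> float:
    """Uncompressed bit rate: pixels per frame x bits per pixel x frames per second."""
    return width * height * bits_per_pixel * fps


def estimated_bitrate_bps(width: int, height: int, fps: float,
                          compression_ratio: float, bits_per_pixel: int = 24) -> float:
    """Very rough compressed estimate; compression_ratio is an assumed input."""
    return raw_bitrate_bps(width, height, fps, bits_per_pixel) / compression_ratio


vga = raw_bitrate_bps(640, 480, 30)      # 221,184,000 bps uncompressed
hd = raw_bitrate_bps(1280, 720, 30)      # three times the pixels, three times the rate
```

Doubling the frame rate or the pixel count doubles the uncompressed rate, which is why both settings matter when sizing the network.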
Analog Video Resolutions
Video surveillance solutions use a set of standard resolutions. National Television System Committee (NTSC) and Phase Alternating Line (PAL) are the two prevalent analog video standards. PAL is used mostly in Europe, China, and Australia and specifies 625 lines per frame with a 50-Hz refresh rate. NTSC is used mostly in the United States, Canada, and portions of South America and specifies 525 lines per frame with a 59.94-Hz refresh rate.
Digital Video Surveillance Resolutions (in pixels)
While image quality is influenced by the resolution configured on the camera, the quality of the lens, sharpness of focus, and lighting conditions also come into play. For example, harshly lighted areas may not offer a well-defined image, even if the resolution is very high. Bright areas may be washed out and shadows may offer little detail. Cameras that offer wide dynamic range processing, an algorithm that samples the image several times with differing exposure settings and provides more detail to the very bright and dark areas, can offer a more detailed image.
As a best practice, do not assume that camera resolution is everything with regard to image quality. For a camera to operate in a day-night environment (the absence of light is zero lux), the night mode must be sensitive to the infrared spectrum. It is highly recommended to conduct tests or pilot installations before buying large quantities of any model of camera.
Video Compression CODECS
The Cisco Video Surveillance Media Server supports IP endpoints that use Motion JPEG (MJPEG) or MPEG-4 codec technology. Both types of codecs have advantages and disadvantages when implemented in a video surveillance system. A system administrator may choose to use MJPEG on certain cameras and MPEG-4 or H.264 on others, depending on system goals and requirements.
A codec is a device or program that performs encoding and decoding on a digital video stream. In IP networking, the term frame refers to a single unit of traffic across an Ethernet or other Layer-2 network. In this guide, frame primarily refers to one image within a video stream. A video frame can consist of multiple IP packets or Ethernet frames.
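To make the distinction between a video frame and a network frame concrete, the sketch below estimates how many IP packets one video frame occupies. The 1500-byte MTU and the 40 bytes of per-packet headers (IPv4, UDP, and RTP without options) are assumptions; the source does not specify the transport.

```python
def packets_per_video_frame(frame_bytes: int, mtu: int = 1500, header_bytes: int = 40) -> int:
    """Number of packets needed to carry one video frame (ceiling division)."""
    payload = mtu - header_bytes          # bytes of video data per packet
    if payload <= 0:
        raise ValueError("header_bytes must be smaller than mtu")
    return (frame_bytes + payload - 1) // payload


# A hypothetical 30,000-byte JPEG image needs 21 packets at 1460 bytes each.
count = packets_per_video_frame(30_000)
```

Losing any one of those packets can damage the whole video frame, which is one reason network reliability matters for surveillance traffic.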
A video stream is fundamentally a sequence of still images. In a video stream with fewer images per second, or a lower frame rate, motion is normally perceived as choppy or broken. At higher frame rates up to 30 frames per second, the video motion appears smoother; however, 15 frames per second video may be adequate for viewing and recording purposes.
Some of the most common digital video formats include the following:
•Motion JPEG (MJPEG) is a format consisting of a sequence of compressed Joint Photographic Experts Group (JPEG) images. These images only benefit from spatial compression within the frame; there is no temporal compression leveraging change between frames. For this reason, the level of compression reached cannot compare to codecs that use a predictive frame approach.
•MPEG-1 and MPEG-2 formats are Discrete Cosine Transform-based with predictive frames and scalar quantization for additional compression. They are widely implemented, and MPEG-2 is still in common use on DVD and in most digital video broadcasting systems. Both formats consume a higher level of bandwidth for a comparable quality level than MPEG-4. These formats are not typically used in IP video surveillance camera deployments.
•MPEG-4 introduced object-based encoding, which handles motion prediction by defining objects within the field of view. MPEG-4 offers an excellent quality level relative to network bandwidth and storage requirements. MPEG-4 is commonly deployed in IP video surveillance but will be replaced by H.264 as it becomes available. MPEG-4 may continue to be used for standard definition cameras.
•H.264 is technically equivalent to MPEG-4 Part 10 and is also referred to as Advanced Video Coding (AVC). This emerging standard offers the potential for greater compression and higher quality than existing compression technologies. It is estimated that the bandwidth savings when using H.264 are at least 25 percent over the same configuration with MPEG-4. The bandwidth savings associated with H.264 are important for high definition and megapixel camera deployments.
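The 25 percent estimate for H.264 can be applied as simple arithmetic when planning. The sketch below is a planning aid only: the function name is hypothetical, and the 2000 kbps MPEG-4 rate is an assumed example, not a measured figure.

```python
def h264_upper_estimate_kbps(mpeg4_kbps: float, savings: float = 0.25) -> float:
    """Upper estimate of the H.264 rate given an MPEG-4 rate.

    With savings of at least 25 percent, the H.264 stream should need
    no more than 75 percent of the MPEG-4 bandwidth.
    """
    return mpeg4_kbps * (1.0 - savings)


# An assumed 2000 kbps MPEG-4 stream would need at most about 1500 kbps.
estimate = h264_upper_estimate_kbps(2000)
```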
An MJPEG codec transmits video as a sequence of Joint Photographic Experts Group (JPEG) encoded images. Each image stands alone without the use of any predictive compression between frames. MJPEG is less computation-intensive than predictive codecs such as MPEG-4, so it can be implemented with good performance on less expensive hardware. MJPEG can easily be recorded at a reduced frame rate by only sampling a subset of a live stream. For example, storing every third frame of a 30-frame per second video stream will result in a recorded archive at 10 frames per second.
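Because every MJPEG frame stands alone, recording at a reduced frame rate amounts to keeping every Nth frame. A minimal sketch, with hypothetical function names and a list of integers standing in for the JPEG images:

```python
def decimate(frames: list, keep_every: int) -> list:
    """Keep every Nth frame of a stream of independent (MJPEG) frames."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return frames[::keep_every]


def recorded_fps(live_fps: float, keep_every: int) -> float:
    """Frame rate of the archive after decimation."""
    return live_fps / keep_every


one_second = list(range(30))          # stand-in for 30 JPEG images
archive = decimate(one_second, 3)     # frames 0, 3, 6, ... 27
```

The same shortcut does not work for predictive codecs, since dropping a frame removes data that later frames depend on.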
MJPEG has a relatively high bandwidth requirement compared to MPEG-4. A 640×480 VGA resolution stream running at 30 frames per second can easily consume 5 to 10 Mbps. The bandwidth required is a function of the complexity of the image, in conjunction with tuning parameters that control the level of compression. Higher levels of compression reduce the bandwidth requirement but also reduce the quality of the decoded image. Since there is no predictive encoding between frames, the amount of motion or change in the image over time has no impact on bandwidth consumption.
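The 5 to 10 Mbps figure can be sanity-checked from the average size of one JPEG image. The frame sizes below are assumptions chosen for illustration; actual sizes depend on scene complexity and the compression setting.

```python
def mjpeg_bitrate_mbps(avg_frame_bytes: float, fps: float) -> float:
    """MJPEG bit rate in Mbps: bytes per frame x 8 bits x frames per second."""
    return avg_frame_bytes * 8 * fps / 1_000_000


def frame_bytes_for(mbps: float, fps: float) -> float:
    """Average frame size in bytes that a given bit rate implies."""
    return mbps * 1_000_000 / (8 * fps)


rate = mjpeg_bitrate_mbps(30_000, 30)   # an assumed 30,000-byte frame at 30 fps
low = frame_bytes_for(5, 30)            # about 20,800 bytes per frame
high = frame_bytes_for(10, 30)          # about 41,700 bytes per frame
```

Under these assumptions, a 30,000-byte average frame at 30 frames per second works out to 7.2 Mbps, inside the quoted range.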
An MPEG-4 codec uses prediction algorithms to achieve higher levels of compression than MJPEG while preserving image quality. Periodic video frames called I-frames are transmitted as complete, standalone images, similar to an MJPEG frame, and are used as a reference point for the predictive frames. The remaining video frames (P-frames) contain only information that has changed since the previous frame.
To achieve compression, MPEG-4 relies on the following types of video frames:
•I-frames (intraframes, independently decodable)—These frames are also referred to as key frames and contain all of the data that is required to display an image in a single frame.
•P-frames (predictive or predicted frames)—This frame type contains only image data that has changed from the previous frame.
•B-frames (bi-directional predictive frames)—This frame type can reference data from both preceding frames and future frames. Referencing of future frames requires frame reordering within the codec.
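The frame reordering mentioned for B-frames can be sketched as follows. A B-frame cannot be decoded until the later reference frame it depends on has arrived, so the encoder sends that reference first. The frame labels and the simple rule used here (each B-frame waits for the next I-frame or P-frame) are illustrative assumptions, not a description of a specific codec.

```python
def display_to_decode_order(frames: list[str]) -> list[str]:
    """Reorder frames from display order to a valid decode order.

    Frames are labelled by type and display index, for example "I0" or "B1".
    B-frames are held back until the next reference frame has been emitted.
    """
    decode: list[str] = []
    pending_b: list[str] = []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)       # needs a future reference frame
        else:
            decode.append(frame)          # I-frame or P-frame goes first
            decode.extend(pending_b)      # then the B-frames that waited for it
            pending_b = []
    decode.extend(pending_b)              # trailing B-frames with no later reference
    return decode


order = display_to_decode_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"])
# order is ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
```

The decoder must buffer frames to restore display order, which adds latency.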
The use of P-frames and B-frames within a video stream can drastically reduce the consumption of bandwidth compared to sending full image information in each frame. However, the resulting variance of the video frames’ size contributes to the fluctuation in the bandwidth that a given stream uses. This is the nature of most codecs because the amount of compression that can be achieved varies greatly with the nature of the video source.
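A rough way to see the saving is to average frame sizes over one group of pictures, meaning one I-frame followed by P-frames. The frame sizes and group length below are assumptions for illustration; real P-frame sizes vary with the amount of motion in the scene, which is the source of the fluctuation described above.

```python
def gop_average_bitrate_bps(i_bytes: int, p_bytes: int, gop_length: int, fps: float) -> float:
    """Average bit rate for a stream of one I-frame plus (gop_length - 1) P-frames."""
    total_bytes = i_bytes + (gop_length - 1) * p_bytes
    return total_bytes * 8 * fps / gop_length


# Assumed sizes: 30,000-byte I-frames, 3,000-byte P-frames, one I-frame per second.
predictive = gop_average_bitrate_bps(30_000, 3_000, gop_length=30, fps=30)
all_intra = gop_average_bitrate_bps(30_000, 3_000, gop_length=1, fps=30)
```

With these assumed numbers the predictive stream averages 936,000 bps against 7,200,000 bps when every frame is sent in full. Each I-frame is still ten times the size of a P-frame, so the instantaneous rate is bursty.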
The Cisco Video Surveillance Manager solution supports the configuration of PTZ cameras connected to encoders or as IP cameras. In order to support PTZ connectivity, the encoder should be able to connect to the camera through a serial interface. The Video Surveillance Manager solution supports the following PTZ protocols: