Open eBook (OEB), formally the Open eBook Publication Structure (OEBPS), is a legacy e-book format that has been superseded by the EPUB format. OEBPS is an XML-based specification for the content, structure, and presentation of electronic books. Previous versions of the Open Publication Structure (OPS) specification and its related specification, OPF, were unified into the single OEBPS.
The Open eBook specification was developed to provide a standard for representing the content of electronic books; it comprises the Publication Structure. Milestones in its history: SoftBook Press, NuvoMedia, and Microsoft; the OEBF formed; the Open eBook Publication Structure (OEBPS); the OEBPS update. An EPUB file is a digital ebook saved in the EPUB format, an open XML-based format. It includes the Open Publication Structure (OPS), which defines the content markup.
It identifies all other files in the publication and provides descriptive information about them. The media type defined here is needed to correctly identify OPF files when they are edited on disk, referenced in OEBPS Container files, or used in other places where media types are used. Clearly it is possible to author malicious files which, for example, contain malformed data.
Most XML parsers protect themselves from such attacks by rigorously enforcing conformance. All processors that read OPF files should rigorously check the size and validity of data retrieved. The registration uses the template present in [RFC].
However, experienced XML developers should be able to adapt the examples to any programming language with XML libraries.
What is EPUB?
You can read the EPUB format using a variety of open source and commercial software on all major operating systems, e-ink devices such as the Sony PRS, and small devices such as the Apple iPhone.
Who is producing EPUB? Is it only for books?
Although traditional print publishers were the first to adopt EPUB, nothing in the format limits its use to eBooks.
PDF is still the most widely used electronic document format in the world. PDF readers are ubiquitous and installed on most modern computers. Specific fonts can be embedded in PDF to control the final output exactly. From a software developer's point of view, PDF falls far short of the ideal: It's not a trivial standard to learn; therefore, it's not a simple matter to throw together your own PDF-generating code. Although PDF libraries are available for most programming languages, many are commercial or are embedded in GUI applications and not easily controlled by external processes.
Not all free libraries continue to be actively maintained. PDF-native text can be extracted and searched programmatically, but few PDFs are tagged such that conversion to a Web-friendly format is simple or reliable.
PDF documents aren't easily reflowable, meaning that they don't adapt well to small screens or to radical changes to their layouts. An alternative format is DTBook, a standard for encoding books for the visually impaired.
See Related topics for more information on DTBook, which is not covered in this tutorial. The specification can be quite strict about the format, contents, and location of those files within the EPUB archive. This section explains what you must know when you work with the EPUB standard. Listing 1.
Open a text editor or an IDE such as Eclipse. I recommend using an editor that has an XML mode, in particular one that can validate against the Relax NG schemas listed in Related topics.
The mimetype file
This one's pretty easy: the mimetype file is required and must be named mimetype.
Additionally, the mimetype file must be the first file in the ZIP archive and must not itself be compressed. For now, just create this file and save it, making sure that it's at the root level of your EPUB project.
The container file lives in a directory called META-INF at the root level of the EPUB. EPUB reading systems will look for this file first, as it points to the location of the metadata for the digital book. Create the META-INF directory and, inside it, open a new file called container.
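The mimetype rules described above (first entry in the ZIP archive, stored without compression) can be sketched with Python's standard zipfile module. This is a minimal illustration rather than a complete packager: the helper name start_epub_archive and the placeholder entry are hypothetical, and the file contents application/epub+zip come from the OCF specification.

```python
import io
import zipfile


def start_epub_archive(buffer):
    """Open a new ZIP archive and write the mimetype entry before anything else."""
    archive = zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED)
    # ZIP_STORED overrides the archive default, so this one entry is not compressed.
    archive.writestr("mimetype", "application/epub+zip",
                     compress_type=zipfile.ZIP_STORED)
    return archive


buffer = io.BytesIO()
with start_epub_archive(buffer) as archive:
    # Hypothetical placeholder; real content files would be added here.
    archive.writestr("OEBPS/placeholder.txt", "placeholder")

# Re-open the archive to confirm the ordering and storage method.
with zipfile.ZipFile(buffer) as archive:
    first = archive.infolist()[0]
    print(first.filename, first.compress_type == zipfile.ZIP_STORED)
```

Writing the mimetype entry inside the helper, before the archive is handed back to the caller, makes it hard to add another file ahead of it by accident.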
The container file is very small, but its structural requirements are strict. Listing 2. Sample container. These topics are not covered in this tutorial.
See the OCF specification for more information. The mimetype and container files are the only two whose locations in the EPUB archive are strictly controlled.
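Because the container file is small but structurally strict, generating it with an XML library is less error-prone than typing it by hand. The sketch below uses Python's xml.etree.ElementTree. The namespace and media type are the ones the OCF specification defines; the default path OEBPS/content.opf is only an assumed conventional location, and build_container_xml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Serialize the container namespace as the default (unprefixed) namespace.
ET.register_namespace("", CONTAINER_NS)


def build_container_xml(opf_path="OEBPS/content.opf"):
    """Return the bytes of a container file pointing at one OPF file.

    opf_path is an assumption for illustration; use your own OPF location.
    """
    container = ET.Element(f"{{{CONTAINER_NS}}}container", {"version": "1.0"})
    rootfiles = ET.SubElement(container, f"{{{CONTAINER_NS}}}rootfiles")
    ET.SubElement(rootfiles, f"{{{CONTAINER_NS}}}rootfile", {
        "full-path": opf_path,
        "media-type": "application/oebps-package+xml",
    })
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(container, encoding="utf-8", xml_declaration=True)


print(build_container_xml().decode("utf-8"))
```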
As recommended (although not required), store the remaining files in the EPUB in a sub-directory.
The following section of this tutorial covers the files that go into OEBPS, the real meat of the digital book: its metadata and its pages.
Open Packaging Format metadata file
Although this file can be named anything, the OPF file is conventionally called content. It specifies the location of all the content of the book, from its text to other media such as images.
Listing 3.
Metadata
Dublin Core defines a set of common metadata terms that you can use to describe a wide variety of digital materials; it's not part of the EPUB specification itself. Any of these terms are allowed in the OPF metadata section. When you build an EPUB for distribution, include as much detail as you can here, although the extract provided in Listing 4 is sufficient to start.
Listing 4. Extract of OPF metadata
The two required terms are title and identifier. According to the EPUB specification, the identifier must be a unique value, although it's up to the digital book creator to define that unique value. Note that the value of the unique-identifier attribute must match the id attribute of the dc:identifier element.
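The matching rule between unique-identifier and dc:identifier is easy to break when files are edited by hand, so a small check can help. The following is a sketch under stated assumptions: the sample OPF extract, its title, and its urn:uuid value are hypothetical, and check_required_metadata is an illustrative helper, not part of any EPUB tool.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical OPF extract used only for illustration.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf"
         version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Hypothetical Sample Book</dc:title>
    <dc:identifier id="bookid">urn:uuid:hypothetical-0001</dc:identifier>
  </metadata>
</package>"""


def check_required_metadata(opf_text):
    """Return a list of problems with the title and identifier metadata."""
    package = ET.fromstring(opf_text)
    metadata = package.find(f"{{{OPF_NS}}}metadata")
    if metadata is None:
        return ["missing metadata element"]
    problems = []
    title = metadata.find(f"{{{DC_NS}}}title")
    if title is None or not (title.text or "").strip():
        problems.append("missing dc:title")
    identifier_ids = [element.get("id")
                      for element in metadata.findall(f"{{{DC_NS}}}identifier")]
    if not identifier_ids:
        problems.append("missing dc:identifier")
    # The package attribute must name the id of a dc:identifier element.
    unique_id = package.get("unique-identifier")
    if unique_id is None or unique_id not in identifier_ids:
        problems.append("unique-identifier does not match any dc:identifier id")
    return problems


print(check_required_metadata(SAMPLE_OPF))
```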
Other metadata to consider adding, if it's relevant to your content, include:
Language as dc:language.
Publication date as dc:date.
Publisher as dc:publisher. This can be your company or individual name.
Copyright information as dc:rights.
Including a meta element with the name attribute containing cover is not part of the EPUB specification directly, but is a recommended way to make cover pages and images more portable.
This example shows both forms. The value of the meta element's content attribute should be the ID of the book's cover image in the manifest, which is the next part of the OPF file. The manifest usually consists of a list of XHTML files that make up the text of the eBook plus some number of related media such as images.
Every file that goes into your digital book must be listed in the manifest. Listing 5 shows the extracted manifest section.
Listing 5. Extract of OPF manifest
You must include the first item, toc. You can include non-supported file types if you provide a fall-back to a core type. See the OPF specification for more information on fall-back items. This is easy to confuse with the reference to the OPF file in the container.
Spine
Although the manifest tells the EPUB reader which files are part of the archive, the spine indicates the order in which they appear, or, in EPUB terms, the linear reading order of the digital book.
One way to think of the OPF spine is that it defines the order of the "pages" of the book. The spine is read in document order, from top to bottom. Listing 6 shows an extract from the OPF file.
Listing 6. Extract of OPF spine
Each itemref element has a required attribute idref, which must match one of the IDs in the manifest.
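The relationship between spine and manifest can be illustrated by resolving each idref to the manifest item it names, in document order. This is a minimal sketch: the sample manifest, its ids, and its file names (toc.ncx, title.html, content.html) are assumptions made for illustration, and reading_order is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical OPF extract; ids and file names are illustrative only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="cover" href="title.html" media-type="application/xhtml+xml"/>
    <item id="content" href="content.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="cover"/>
    <itemref idref="content"/>
  </spine>
</package>"""


def reading_order(opf_text):
    """Return manifest hrefs in spine order, rejecting unknown idrefs."""
    package = ET.fromstring(opf_text)
    hrefs = {item.get("id"): item.get("href")
             for item in package.findall("opf:manifest/opf:item", OPF_NS)}
    order = []
    # findall yields itemref elements in document order, top to bottom.
    for itemref in package.findall("opf:spine/opf:itemref", OPF_NS):
        idref = itemref.get("idref")
        if idref not in hrefs:
            raise ValueError(f"spine idref {idref!r} is not in the manifest")
        order.append(hrefs[idref])
    return order


print(reading_order(SAMPLE_OPF))
```

Raising an error for an unmatched idref is a design choice for this sketch; a validator might instead collect every mismatch before reporting.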
The toc attribute is also required. The linear attribute in the spine indicates whether the item is considered part of the linear reading order versus being extraneous front- or end-matter.
Guide
The last part of the OPF content file is the guide. This section is optional but recommended. Listing 7 shows an extract from a guide file.
Listing 7. Extract of an OPF guide
The guide is a way of providing semantic information to an EPUB reading system.
While the manifest defines the physical resources in the EPUB and the spine provides information about their order, the guide explains what the sections mean. Here's a partial list of the values that are allowed in the OPF guide:
cover: The book cover
title-page: A page with author and publisher information
toc: The table of contents
For a complete list, see the OPF 2.
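To show how a reading system might use the guide, the sketch below maps each guide type to the file it points at. The sample guide, its titles, and its file names are hypothetical, and guide_targets is an illustrative helper rather than an established API.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical guide extract; titles and hrefs are illustrative only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <guide>
    <reference type="cover" title="Cover" href="title.html"/>
    <reference type="toc" title="Table of Contents" href="toc.html"/>
  </guide>
</package>"""


def guide_targets(opf_text):
    """Map each guide reference type (cover, toc, ...) to its href."""
    package = ET.fromstring(opf_text)
    return {reference.get("type"): reference.get("href")
            for reference in package.findall("opf:guide/opf:reference", OPF_NS)}


targets = guide_targets(SAMPLE_OPF)
print(targets.get("cover"))
```

Because the guide is optional, callers should expect an empty mapping and use get rather than indexing.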
This is rarely a problem when you generate EPUBs programmatically, where the same code can output to two different files. Take care to put the same information in both places, as different EPUB readers might use the values from one or the other. Although the OCF file is defined as part of EPUB itself, the last major metadata file is borrowed from a different digital book standard.
DAISY is a consortium that develops data formats for readers who are unable to use traditional books, often because of visual impairments or the inability to manipulate printed works.
The NCX defines the table of contents of the digital book. In complex books, it is typically hierarchical, containing nested parts, chapters, and sections.
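The hierarchical shape of an NCX table of contents can be seen by walking its nested navPoint elements recursively. The sketch below assumes a small hypothetical NCX document; the labels, ids, and file names are invented for illustration, and outline is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

# Hypothetical NCX extract with one nested chapter.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint id="part1" playOrder="1">
      <navLabel><text>Part One</text></navLabel>
      <content src="part1.html"/>
      <navPoint id="ch1" playOrder="2">
        <navLabel><text>Chapter 1</text></navLabel>
        <content src="part1.html#ch1"/>
      </navPoint>
    </navPoint>
    <navPoint id="part2" playOrder="3">
      <navLabel><text>Part Two</text></navLabel>
      <content src="part2.html"/>
    </navPoint>
  </navMap>
</ncx>"""


def outline(parent, depth=0):
    """Return (depth, label, src) tuples for navPoints nested under parent."""
    entries = []
    # Only direct children are matched; deeper levels are handled by recursion.
    for nav_point in parent.findall("ncx:navPoint", NCX_NS):
        label = nav_point.findtext("ncx:navLabel/ncx:text", default="",
                                   namespaces=NCX_NS)
        content = nav_point.find("ncx:content", NCX_NS)
        src = content.get("src") if content is not None else None
        entries.append((depth, label, src))
        entries.extend(outline(nav_point, depth + 1))
    return entries


nav_map = ET.fromstring(SAMPLE_NCX).find("ncx:navMap", NCX_NS)
for depth, label, src in outline(nav_map):
    print("  " * depth + f"{label} -> {src}")
```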