RTF, DOC, and HTML - Part 5 - RTF to DOC

In the previous sections, we looked at RTF and how to use Docx2Rtf. In this section, we explore the remaining aspects of RTF and how to convert from RTF to DOC format.

The essential components of RTF

  • An RTF file uses 7-bit ASCII, which makes it easy to transfer between operating systems. It consists of unformatted text, control words, control symbols, and groups.

Control word

  • An RTF control word is a specially formatted command used to mark how characters are displayed on the screen or sent to the printer. A control word cannot be longer than 32 characters. A control word takes the following form:
        \<ASCII string><Delimiter>
  • <ASCII string> is made up of ASCII letters (a to z and A to Z). The original control words contain no uppercase characters; however, uppercase characters appear in some newer control words.
  • <Delimiter> marks the end of the control word name (for example, the end of \par). It can be one of the following:
  • A space, which is treated as part of the control word and is not displayed as text.
  • A digit or a hyphen (-), which means that a numeric parameter follows the name.
  • Any character other than a letter or a digit, which ends the control word but is not part of it.
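The control word rules above can be sketched as a small parser. This is only an illustration: the class name ControlWordParser and the method TryParse are hypothetical names chosen for this example, they are not part of any library mentioned in this article, and the 32-character limit is not enforced.

```csharp
using System;
using System.Globalization;

public static class ControlWordParser
{
    // Parses one control word starting at rtf[index], which must be a backslash.
    // Returns false if no control word starts there (for example a control symbol such as \~).
    public static bool TryParse(string rtf, int index, out string name, out int? parameter, out int next)
    {
        name = "";
        parameter = null;
        next = index;
        if (index >= rtf.Length || rtf[index] != '\\') return false;

        int i = index + 1;
        int start = i;
        // The name is made of ASCII letters only.
        while (i < rtf.Length && IsAsciiLetter(rtf[i])) i++;
        if (i == start) return false; // backslash followed by a non-letter
        name = rtf.Substring(start, i - start);

        // Optional numeric parameter: an optional hyphen followed by digits.
        int p = i;
        if (p < rtf.Length && rtf[p] == '-') p++;
        int firstDigit = p;
        while (p < rtf.Length && rtf[p] >= '0' && rtf[p] <= '9') p++;
        if (p > firstDigit)
        {
            parameter = int.Parse(rtf.Substring(i, p - i), CultureInfo.InvariantCulture);
            i = p;
        }

        // A single space delimiter belongs to the control word and is consumed.
        // Any other delimiter is left in place for the caller.
        if (i < rtf.Length && rtf[i] == ' ') i++;
        next = i;
        return true;
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void Main()
    {
        string rtf = @"\fs24 Hello\par";
        if (TryParse(rtf, 0, out string name, out int? value, out int next))
        {
            Console.WriteLine(name + " " + value + " | rest: " + rtf.Substring(next));
        }
    }
}
```

The parser returns the position after the control word in next, so a caller can continue reading text or the next control word from there.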

Control symbol

  • A control symbol consists of a backslash followed by a single non-alphabetic character. For example, \~ represents a nonbreaking space. Control symbols take no delimiters. For example, a space after a control symbol is treated as text, not as a delimiter.

Group

  • A group consists of text, control words, and control symbols enclosed in braces ({ }). The opening brace ({) indicates the start of the group, and the closing brace (}) indicates the end of the group. Each group specifies the text it affects and the attributes of that text. An RTF file can include groups for fonts, styles, colors, images, footnotes, comments (annotations), headers and footers, summary information, fields, and bookmarks, as well as formatting attributes, maths, and so on. If the font, file, style, color, revision, and summary information groups and the document properties are included in the file, they must appear in the RTF header. If a group is not used, it can be omitted. Any group that uses attributes defined in another group must appear after the group that defines them. For example, the color and font groups must precede the style group.
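As a rough illustration of how groups nest, the following sketch counts brace depth while treating a backslash plus the next character as one unit, so the control symbols \{ and \} are not mistaken for group delimiters. The class and method names are hypothetical, and the sketch does not handle embedded binary data.

```csharp
using System;

public static class RtfGroups
{
    // Returns the deepest nesting level of groups, or -1 if the braces are unbalanced.
    public static int MaxDepth(string rtf)
    {
        int depth = 0;
        int max = 0;
        for (int i = 0; i < rtf.Length; i++)
        {
            char c = rtf[i];
            if (c == '\\')
            {
                // Skip the character after a backslash so that \{ \} and \\
                // are not counted as group delimiters.
                i++;
            }
            else if (c == '{')
            {
                depth++;
                if (depth > max) max = depth;
            }
            else if (c == '}')
            {
                depth--;
                if (depth < 0) return -1; // closing brace without an opening brace
            }
        }
        return depth == 0 ? max : -1;
    }

    public static void Main()
    {
        Console.WriteLine(MaxDepth(@"{\rtf1{\fonttbl{\f0 Arial;}}Braces: \{ \}}"));
    }
}
```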

Structure of RTF file

  • An RTF file has the following syntax:
    <File> '{' <header> <document> '}'
  • This is the standard RTF syntax; any RTF reader must be able to interpret it correctly. An RTF reader does not need to interpret every control word, but it must be able to skip unknown (or unused) control words.
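The idea of skipping unknown control words can be sketched as a tiny text extractor. It is a simplified illustration under several assumptions: the names TinyRtfReader and ExtractText are hypothetical, only \par, \tab, and a few control symbols are interpreted, and header groups such as the font table are not recognized, so their contents would leak into the output of a real-world file.

```csharp
using System;
using System.Text;

public static class TinyRtfReader
{
    public static string ExtractText(string rtf)
    {
        var text = new StringBuilder();
        int i = 0;
        while (i < rtf.Length)
        {
            char c = rtf[i];
            if (c == '{' || c == '}' || c == '\r' || c == '\n')
            {
                i++; // group delimiters and raw line breaks carry no text
            }
            else if (c == '\\' && i + 1 < rtf.Length)
            {
                i++;
                if (IsLetter(rtf[i]))
                {
                    int start = i;
                    while (i < rtf.Length && IsLetter(rtf[i])) i++;
                    string word = rtf.Substring(start, i - start);
                    // Skip an optional numeric parameter and one delimiting space.
                    if (i + 1 < rtf.Length && rtf[i] == '-' && IsDigit(rtf[i + 1])) i++;
                    while (i < rtf.Length && IsDigit(rtf[i])) i++;
                    if (i < rtf.Length && rtf[i] == ' ') i++;

                    if (word == "par") text.Append('\n');
                    else if (word == "tab") text.Append('\t');
                    // Any other control word is unknown to this reader and is skipped.
                }
                else
                {
                    char symbol = rtf[i++];
                    if (symbol == '\\' || symbol == '{' || symbol == '}') text.Append(symbol);
                    else if (symbol == '~') text.Append('\u00A0'); // nonbreaking space
                    // Other control symbols are ignored in this sketch.
                }
            }
            else
            {
                text.Append(c);
                i++;
            }
        }
        return text.ToString();
    }

    private static bool IsLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static bool IsDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    public static void Main()
    {
        Console.WriteLine(ExtractText(@"{\rtf1\ansi\deff0 Hello\par World \{ok\}}"));
    }
}
```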

Header

  • The header has the following syntax:
    <header>
    \rtf1 \fbidis? <character set> <from>? <deffont> <deflang> <fonttbl>? <filetbl>? <colortbl>? <stylesheet>? <stylerestrictions>? <listtables>? <revtbl>? <rsidtable>? <mathprops>? <generator>?
  • The header elements that are present must appear in the order listed above. Document properties can be placed before or between the header tables. Each attribute must be defined before it is referenced. For example:
  • The style sheet must be defined before any style is used.
  • The font table (including font styles and sizes) must be defined before any font is referenced.

Document

  • Once the RTF header has been defined, the RTF reader has enough information to read the actual document text correctly. The <document> contains document information followed by one or more sections, with the following syntax:
    <document>        <info>? <xmlnstbl>? <docfmt>* <section>+
  • <info>: "Information Group": the control word \info introduces the information group, which contains information about the document, including the title, author, subject, keywords, and other file-specific information.
  • <xmlnstbl>: "XML Namespace Table": contains the namespaces for XML and SmartTags used in the RTF document. For example:
        {\*\xmlnstbl{\xmlns1 {HYPERLINK "https://www.officecomponent.com/demos/office-word/"}}}
  • <docfmt>: "Document Formatting Properties": after the information group and the XML namespace table, there may be control words that define the document format (referred to as <docfmt>). These control words define document properties such as margins and footnote placement. These attributes must precede the first plain-text character in the document.
  • <section>: each section of the text has the following syntax:
        <section>         <secfmt>* <hdrftr>? <para>+ (\sect <section>)?
  • Each section starts with the section-formatting control words <secfmt>, followed by optional headers and footers <hdrftr>, and then one or more paragraphs <para>, which are either plain text or tables.
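To make the '{' <header> <document> '}' structure concrete, here is a minimal sketch that assembles a small RTF file as a string. The names MinimalRtf, Escape, and Build are hypothetical, the header contains only \rtf1, a character set, a default font, and a one-entry font table, and non-ASCII characters are not handled.

```csharp
using System;
using System.Text;

public static class MinimalRtf
{
    // Escapes the three characters that have a special meaning in RTF
    // and turns line breaks into \par.
    public static string Escape(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c == '\\' || c == '{' || c == '}') sb.Append('\\').Append(c);
            else if (c == '\n') sb.Append("\\par ");
            else sb.Append(c);
        }
        return sb.ToString();
    }

    public static string Build(string text)
    {
        // <header>: version, character set, default font, font table (in that order).
        const string header = @"\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}";
        // <document>: a single section containing one paragraph of plain text.
        string document = @"\pard\f0\fs24 " + Escape(text) + @"\par";
        return "{" + header + document + "}";
    }

    public static void Main()
    {
        Console.WriteLine(Build("Hi {there}"));
    }
}
```

Saving the returned string to a file with the .rtf extension should give a document that common RTF readers can open.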

When an RTF file is opened, the reader parses its contents, separates the parts, and ignores the parts that it does not understand. Besides the components above, the RTF structure also has other components that users need for specific tasks; these affect the whole document. Unlike many word-processing formats, RTF code can be read and understood even without specialized software (a plain text viewer is enough). The default font and margin values may vary between applications and between versions, but the format retains high compatibility. Unlike formats such as Microsoft Word DOC, Office Open XML, or OpenDocument, RTF does not support macros, so it does not carry macro viruses. However, a file name ending in .rtf does not guarantee that the file is in RTF format. To check a file without running any macro, open it in a plain text viewer.
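The same check can be done in code by looking at the first bytes of the file, since an RTF file starts with an opening brace followed by the \rtf control word. The sketch below is only a quick signature test, not a validator, and the names RtfSniffer, LooksLikeRtf, and FileLooksLikeRtf are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class RtfSniffer
{
    // True if the content starts with the bytes of "{\rtf".
    public static bool LooksLikeRtf(byte[] content)
    {
        byte[] signature = Encoding.ASCII.GetBytes(@"{\rtf");
        if (content.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (content[i] != signature[i]) return false;
        }
        return true;
    }

    // Reads only the first five bytes of the file, so nothing in it is executed or parsed.
    public static bool FileLooksLikeRtf(string path)
    {
        byte[] head = new byte[5];
        int total = 0;
        using (FileStream stream = File.OpenRead(path))
        {
            int read;
            while (total < head.Length && (read = stream.Read(head, total, head.Length - total)) > 0)
            {
                total += read;
            }
        }
        Array.Resize(ref head, total);
        return LooksLikeRtf(head);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeRtf(Encoding.ASCII.GetBytes(@"{\rtf1\ansi Hi}")));
    }
}
```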

Uses

  • Most word processors support RTF (in some version). This makes RTF a "common" format for word-processing software running on different operating systems. However, compatibility depends in part on the version of RTF used. Most RTF readers ignore RTF control words that they do not understand. The RTF format is used by many editors, such as WordPad, MacWrite, WriteNow, TextEdit, AbiWord, OpenOffice.org, KWord, and Bean. It is supported by applications on Microsoft Windows and Mac OS X, and by open-source programs such as rtf2xml.

An Amazing RTF to DOC Converter for .NET Applications

  • Introducing Word Office Component, a third-party library that lets you convert from RTF format to DOC format with just a few lines of code.
  • To do this, first make sure you have installed the Word Office Component library and imported its namespace:
  • C# Version: using OfficeComponent.Word;
  • VB Version: Imports OfficeComponent.Word
  • Next, create a Word document. You can easily achieve this with the WordDocument class, which is provided by the Word Office Component library.
  • C# Version: WordDocument document = new WordDocument();
  • VB Version: Dim document As New WordDocument()
  • We will use this document variable to hold the output of the program, i.e., the content of the RTF file after conversion to DOC.
  • After that, locate the RTF file and open it with the Open() method of the WordDocument object. Make sure to pass the parameter that identifies the input as RTF, in this case WordDocumentFormat.Rtf.
  • C# Version: document.Open(FilePath, WordDocumentFormat.Rtf);
  • VB Version: document.Open(FilePath, WordDocumentFormat.Rtf)
  • In this step, because we specified RTF as the input format, the Open() method handles the conversion from RTF format to DOC format.
  • And voila, you've got yourself a DOC document converted from RTF format.
  • You can save this DOC document for later use. Saving the document to a path is described in more detail in our Word package.

The full source code of this example is also available in our Word package.

A live demo for RTF to DOC is also available on our site. If you also need Word functionality, check out our Word online demos.
