[gs-cvs] rev 7095 - in trunk/gs: Resource/CMap doc lib src

leonardo at ghostscript.com
Tue Oct 10 03:03:38 PDT 2006


Author: leonardo
Date: 2006-10-10 03:03:37 -0700 (Tue, 10 Oct 2006)
New Revision: 7095

Added:
   trunk/gs/Resource/CMap/Identity-UTF16-H
   trunk/gs/Resource/CMap/Identity-UTF16-V
Modified:
   trunk/gs/doc/Use.htm
   trunk/gs/lib/gs_ciddc.ps
   trunk/gs/lib/gs_fntem.ps
   trunk/gs/src/gdevpdtc.c
   trunk/gs/src/gsfcmap.c
   trunk/gs/src/zfcmap.c
Log:
Add handling of true Unicode fonts.

DETAILS :

Bug 688897 "Unicode fonts in Postscript" for the customer #130.

1. Define a new Ordering = Unicode for handling Unicode character sets.

2. Define a new Registry = Artifex to distinguish those
   character sets from those of other PostScript vendors,
   which do not define such character sets yet.

3. Define new Identity-UTF16-H and Identity-UTF16-V CMaps for identity
   mapping from Unicode UTF-16 to CIDs with the Unicode character set.

4. As a part of CID font emulation with TrueType fonts,
   add the Unicode.Unicode entry to the .CMapChooser table,
   which is actually an identity translation (gs_ciddc.ps).

6. Extend the gs/lib/cidfmap syntax to handle
   Registry = Artifex (gs_fntem.ps).

7. Force embedding of Artifex CMaps into PDF (gsfcmap.c).

8. Take into account that an Artifex Unicode CMap
   may be used for generating a ToUnicode CMap (zfcmap.c).

9. Document all that in gs/doc .

10. Bug: range.size was not initialized when creating
   an identity ToUnicode CMap (gsfcmap.c).
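
For illustration only, items 3, 7 and 8 above can be sketched in
standalone C. This is not the Ghostscript code: the type csi_sketch_t
and the functions bytes_equal, is_artifex_registry, is_artifex_unicode
and utf16_code_to_cid are hypothetical names invented for this sketch.
It assumes that Registry and Ordering are counted byte strings, as the
patch's comparisons against "Artifex" and "Unicode" with length 7
suggest, and that a two-byte code in the <0000>..<FFFF> codespace is
read high byte first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a CIDSystemInfo record (not Ghostscript's type). */
typedef struct {
    const char *registry;   /* counted byte string, not NUL-terminated */
    size_t registry_size;
    const char *ordering;   /* counted byte string, not NUL-terminated */
    size_t ordering_size;
    int supplement;
} csi_sketch_t;

/* True when the counted byte string equals the C string literal. */
bool bytes_equal(const char *data, size_t size, const char *literal)
{
    size_t n = strlen(literal);
    return size == n && memcmp(data, literal, n) == 0;
}

/* Item 7: a CMap with Registry = Artifex is never treated as a
 * well-known identity CMap, so that it gets embedded into the PDF. */
bool is_artifex_registry(const csi_sketch_t *csi)
{
    assert(csi != NULL);
    return bytes_equal(csi->registry, csi->registry_size, "Artifex");
}

/* Item 8: Registry = Artifex together with Ordering = Unicode marks a
 * CMap whose codes are Unicode values, usable for a ToUnicode CMap. */
bool is_artifex_unicode(const csi_sketch_t *csi)
{
    return is_artifex_registry(csi) &&
           bytes_equal(csi->ordering, csi->ordering_size, "Unicode");
}

/* Item 3: the identity CMap maps a two-byte code <hi lo> in the
 * codespace <0000>..<FFFF> to the CID with the same numeric value. */
unsigned utf16_code_to_cid(const unsigned char code[2])
{
    assert(code != NULL);
    return ((unsigned)code[0] << 8) | code[1];
}
```

With these definitions, a record holding (Artifex, Unicode, 0) passes
both checks, a record holding (Adobe, Identity, 0) passes neither, and
the code bytes 4E 2D map to CID 0x4E2D.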

A PDF generated with this patch does not appear
to be searchable with Adobe Acrobat,
and Acrobat cannot extract the text properly.
Not sure why. It is likely another Adobe bug.

EXPECTED DIFFERENCES :

None.


Added: trunk/gs/Resource/CMap/Identity-UTF16-H
===================================================================
--- trunk/gs/Resource/CMap/Identity-UTF16-H	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/Resource/CMap/Identity-UTF16-H	2006-10-10 10:03:37 UTC (rev 7095)
@@ -0,0 +1,54 @@
+% Copyright (C) 2003 Artifex Software.  All rights reserved.
+%
+% This software is provided AS-IS with no warranty, either express or
+% implied.
+%
+% This software is distributed under license and may not be copied,
+% modified or distributed except as expressly authorized under the terms
+% of the license contained in the file LICENSE in this distribution.
+%
+% For more information about licensing, please refer to
+% http://www.ghostscript.com/licensing/. For information on
+% commercial licensing, go to http://www.artifex.com/licensing/ or
+% contact Artifex Software, Inc., 101 Lucas Valley Road #110,
+% San Rafael, CA  94903, U.S.A., +1(415)492-9861.
+%
+% $Id$
+% Identity-UTF16-H CMap
+% An identity mapping of UTF16 codes to CIDs.
+
+/CIDInit /ProcSet findresource begin
+
+12 dict begin
+
+begincmap
+
+/CIDSystemInfo 3 dict dup begin
+  /Registry (Artifex) def
+  /Ordering (Unicode) def
+  /Supplement 0 def
+end def
+
+/CMapName /Identity-UTF16-H def
+/CMapVersion 1.000 def
+/CMapType 1 def
+
+/UIDOffset 0 def
+% No XUID yet obtained.
+
+/WMode 0 def
+
+1 begincodespacerange
+  <0000>   <FFFF>
+endcodespacerange
+
+1 begincidrange
+<0000> <FFFF> 0
+endcidrange
+endcmap
+CMapName currentdict /CMap defineresource pop
+end
+end
+
+%%EndResource
+%%EOF


Property changes on: trunk/gs/Resource/CMap/Identity-UTF16-H
___________________________________________________________________
Name: svn:keywords
   + Id
Name: svn:eol-style
   + native

Added: trunk/gs/Resource/CMap/Identity-UTF16-V
===================================================================
--- trunk/gs/Resource/CMap/Identity-UTF16-V	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/Resource/CMap/Identity-UTF16-V	2006-10-10 10:03:37 UTC (rev 7095)
@@ -0,0 +1,54 @@
+% Copyright (C) 2003 Artifex Software.  All rights reserved.
+%
+% This software is provided AS-IS with no warranty, either express or
+% implied.
+%
+% This software is distributed under license and may not be copied,
+% modified or distributed except as expressly authorized under the terms
+% of the license contained in the file LICENSE in this distribution.
+%
+% For more information about licensing, please refer to
+% http://www.ghostscript.com/licensing/. For information on
+% commercial licensing, go to http://www.artifex.com/licensing/ or
+% contact Artifex Software, Inc., 101 Lucas Valley Road #110,
+% San Rafael, CA  94903, U.S.A., +1(415)492-9861.
+%
+% $Id$
+% Identity-UTF16-V CMap
+% An identity mapping of UTF16 codes to CIDs.
+
+/CIDInit /ProcSet findresource begin
+
+12 dict begin
+
+begincmap
+
+/CIDSystemInfo 3 dict dup begin
+  /Registry (Artifex) def
+  /Ordering (Unicode) def
+  /Supplement 0 def
+end def
+
+/CMapName /Identity-UTF16-V def
+/CMapVersion 1.000 def
+/CMapType 1 def
+
+/UIDOffset 0 def
+% No XUID yet obtained.
+
+/WMode 1 def
+
+1 begincodespacerange
+  <0000>   <FFFF>
+endcodespacerange
+
+1 begincidrange
+<0000> <FFFF> 0
+endcidrange
+endcmap
+CMapName currentdict /CMap defineresource pop
+end
+end
+
+%%EndResource
+%%EOF


Property changes on: trunk/gs/Resource/CMap/Identity-UTF16-V
___________________________________________________________________
Name: svn:keywords
   + Id
Name: svn:eol-style
   + native

Modified: trunk/gs/doc/Use.htm
===================================================================
--- trunk/gs/doc/Use.htm	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/doc/Use.htm	2006-10-10 10:03:37 UTC (rev 7095)
@@ -52,6 +52,7 @@
 <li><a href="#Font_lookup">Font lookup</a>
 <li><a href="#CIDFonts">CID fonts</a>
 <li><a href="#CIDFontSubstitution">CID font substitution</a>
+<li><a href="#UnicodeTT">Using Unicode True Type fonts</a>
 <li><a href="#Temp_files">Temporary files</a>
 </ul>
 
@@ -1218,10 +1219,17 @@
             The first font in a collection is 0.
             Default value is 0.
 <tr valign="top">       <td><b><tt>/CSI</tt></b>
-        <td>array of 2 elements
+        <td>array of 2 or 3 elements
         <td>(required) Information for building <b><tt>CIDSystemInfo</tt></b>.
-            The first element is a string, which specifies <b><tt>Ordering</tt></b>.
-            The second element is a number, which specifies <b><tt>Supplement</tt></b>.
+            <p>
+            If the array consists of 2 elements,
+            the first element is a string, which specifies <b><tt>Ordering</tt></b>;
+            the second element is a number, which specifies <b><tt>Supplement</tt></b>.
+            <p>
+            If the array consists of 3 elements,
+            the first element is a string, which specifies <b><tt>Registry</tt></b>;
+            the second element is a string, which specifies <b><tt>Ordering</tt></b>;
+            the third element is a number, which specifies <b><tt>Supplement</tt></b>.
 </table>
 
 <p>
@@ -1298,6 +1306,47 @@
 you need to create a suitable <b><tt>lib/cidfmap</tt></b> by hand,
 possibly a specific one for each document.
 
+<h3><a name="UnicodeTT"></a>Using Unicode True Type fonts</h3>
+
+Ghostscript can handle True Type fonts with the full Unicode UTF-16 character set.
+To do that, third-party software should generate a Postscript
+or PDF document with text that is encoded with the
+UTF-16 encoding. Ghostscript may be used for converting
+such Postscript documents to PDF and for 
+re-distilling such PDF documents to PDF subsets.
+
+<p>
+To render UTF-16 encoded text, one must do the following:
+
+<ul>
+
+<li> 
+Provide a True Type font with Unicode Encoding.
+It must have a <b><tt>cmap</tt></b> table with 
+<b><tt>platformID</tt></b> equal to 3 (Windows),
+and <b><tt>SpecificID</tt></b> equal to 1 (Unicode).
+
+<li> 
+Describe the font in <b><tt>lib/cidfmap</tt></b>
+with special values for the <b><tt>CSI</tt></b> key :
+<b><tt>[/Artifex /Unicode 0]</tt></b>.
+
+<li> 
+In the PS or PDF document combine the font 
+with one of the CMaps <b><tt>Identity-UTF16-H</tt></b>
+(for the horizontal writing mode)
+or <b><tt>Identity-UTF16-V</tt></b> 
+(for the vertical writing mode).
+Those CMaps are distributed with Ghostscript
+in <b><tt>Resource/CMap</tt></b>.
+
+</ul>
+
+Please note that <b><tt>/Registry (Adobe) /Ordering (Identity)</tt></b>
+won't properly work for Unicode documents,
+especially for the searchability feature
+(see <a href="#CIDFontSubstitution">CID font substitution</a>).
+
 <h3><a name="Temp_files"></a>Temporary files</h3>
 
 <blockquote><table cellpadding=0 cellspacing=0>

Modified: trunk/gs/lib/gs_ciddc.ps
===================================================================
--- trunk/gs/lib/gs_ciddc.ps	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/lib/gs_ciddc.ps	2006-10-10 10:03:37 UTC (rev 7095)
@@ -153,6 +153,7 @@
     /Korea1.Wansung  [ /KSCms-UHC-V /KSCms-UHC-H ]
     /Korea1.Unicode  [ /UniKS-UCS2-H /UniKS-UCS2-V ]
     /Identity.Symbol [ /Identity-H /Identity-V ]
+    /Unicode.Unicode [ /Identity-UTF16-H ]
   >> def
 
   /.MakeInstance    % <name> .MakeInstance <inst>

Modified: trunk/gs/lib/gs_fntem.ps
===================================================================
--- trunk/gs/lib/gs_fntem.ps	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/lib/gs_fntem.ps	2006-10-10 10:03:37 UTC (rev 7095)
@@ -88,10 +88,17 @@
 /TranslateCSI   % <record> TranslateCSI <CSI>
 { RESMPDEBUG { (fntem TranslateCSI beg ) = } if
   begin
-  << /Registry (Adobe) 
-     /Ordering CSI aload pop 
-     /Supplement exch 
-  >>
+  CSI length 2 eq {
+    << /Registry (Adobe) 
+       /Ordering CSI aload pop 
+       /Supplement exch 
+    >>
+  } {
+    << /Registry CSI 0 get
+       /Ordering CSI 1 get
+       /Supplement CSI 2 get
+    >>
+  } ifelse
   end
   RESMPDEBUG { (fntem TranslateCSI end ) = } if
 } bind def
@@ -170,6 +177,8 @@
 
   /Identity []
 
+  /Unicode []
+
 >> def
 setpacking
 

Modified: trunk/gs/src/gdevpdtc.c
===================================================================
--- trunk/gs/src/gdevpdtc.c	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/src/gdevpdtc.c	2006-10-10 10:03:37 UTC (rev 7095)
@@ -285,6 +285,7 @@
 		if (code < 0)
 		    return code;
 		pidcmap->CMapType = 2;	/* per PDF Reference */
+		pidcmap->ToUnicode = true;
 		code = pdf_cmap_alloc(pdev, pidcmap,
 				&pdev->Identity_ToUnicode_CMaps[pcmap->WMode], -1);
 		if (code < 0)

Modified: trunk/gs/src/gsfcmap.c
===================================================================
--- trunk/gs/src/gsfcmap.c	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/src/gsfcmap.c	2006-10-10 10:03:37 UTC (rev 7095)
@@ -81,6 +81,7 @@
 
 	memset(penum->range.first, 0, pcimap->num_bytes);
 	memset(penum->range.last, 0xff, pcimap->num_bytes);
+	penum->range.size = pcimap->num_bytes;
 	penum->index = 1;
 	return 0;
     }
@@ -360,6 +361,9 @@
  * For a random CMap, compute whether it is identity.
  * It is not applicable to gs_cmap_ToUnicode_t due to
  * different sizes of domain keys and range values.
+ * Note we reject CMaps with Registry=Artifex
+ * to force embedding special nonstandard CMaps,
+ * which are not commonly in use yet.
  */
 bool
 gs_cmap_compute_identity(const gs_cmap_t *pcmap, int font_index_only)
@@ -368,6 +372,9 @@
     gs_cmap_lookups_enum_t lenum;
     int code;
 
+    if (!bytes_compare(pcmap->CIDSystemInfo->Registry.data, pcmap->CIDSystemInfo->Registry.size,
+		    (const byte *)"Artifex", 7))
+	return false;
     for (gs_cmap_lookups_enum_init(pcmap, which, &lenum);
 	 (code = gs_cmap_enum_next_lookup(&lenum)) == 0; ) {
 	if (font_index_only >= 0 && lenum.entry.font_index != font_index_only)

Modified: trunk/gs/src/zfcmap.c
===================================================================
--- trunk/gs/src/zfcmap.c	2006-10-06 18:53:35 UTC (rev 7094)
+++ trunk/gs/src/zfcmap.c	2006-10-10 10:03:37 UTC (rev 7095)
@@ -457,6 +457,11 @@
 	goto fail;
     if ((code = acquire_code_map(&pcmap->notdef, &rnotdefs, pcmap, imemory)) < 0)
 	goto fail;
+    if (!bytes_compare(pcmap->CIDSystemInfo->Registry.data, pcmap->CIDSystemInfo->Registry.size,
+		    (const byte *)"Artifex", 7) &&
+	!bytes_compare(pcmap->CIDSystemInfo->Ordering.data, pcmap->CIDSystemInfo->Ordering.size,
+		    (const byte *)"Unicode", 7))
+	pcmap->from_Unicode = true;
     pcmap->mark_glyph = zfont_mark_glyph_name;
     pcmap->mark_glyph_data = 0;
     pcmap->glyph_name = zfcmap_glyph_name;


