[Gs-devel] refine MDRC patch

mpsuzuki at hiroshima-u.ac.jp mpsuzuki at hiroshima-u.ac.jp
Sat Apr 14 05:07:30 PDT 2001


Dear Mr. Igor,

I am now refining my MDRC (multi-dimensional range comparator) patch,
and I have a few questions that would help me improve the implementation.

# The MDRC patch is really my first program in C;
# I'm an old man living in the Fortran77 world :-).

Best Wishes,
mpsuzuki

=======================================================================
First, I should note my terminology.
If it differs from the usage in the GS code,
it means that I am misunderstanding GS,
so please correct me.

code:
	the byte string written in a CMap and passed to endcidrange etc.
	In the following CMap,

	1 begincidrange
	<2121> <217E>  0
	endcidrange

	I call <2121> code_lo and <217E> code_hi.

prefix:
	In GS's internal representation of a CMap,
	a "code" is decomposed into two parts.
	The common leading bytes of the codes
	are called the "prefix".

key:
	The variable bytes following the "prefix" in the codes
	are called the "key".

value:
	The CID offset (in the above example, "0"),
	glyph name, or string (in a CMap for a rearranged font)
	that is paired with a "key" is called the "value".
----------------------------------------------------------------------

My implementation takes the prefix to be the longest run of common leading
bytes of <code_lo> and <code_hi>, when <code_lo> != <code_hi>.

When <code_lo> == <code_hi>, how should I construct the prefix and key?
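A minimal sketch of this split, written in Python for brevity rather than
in the C of the patch. The name split_prefix is hypothetical and is not
taken from the GS sources; it only illustrates the "longest common leading
bytes" rule.

```python
def split_prefix(code_lo: bytes, code_hi: bytes):
    """Split a range into (prefix, key_lo, key_hi).

    The prefix is the longest run of leading bytes shared by both codes;
    the remaining bytes of each code form the key bounds.
    Hypothetical helper, not part of Ghostscript.
    """
    if len(code_lo) != len(code_hi):
        raise ValueError("code_lo and code_hi must have the same length")
    n = 0
    while n < len(code_lo) and code_lo[n] == code_hi[n]:
        n += 1
    return code_lo[:n], code_lo[n:], code_hi[n:]


prefix, key_lo, key_hi = split_prefix(bytes.fromhex("2121"), bytes.fromhex("217E"))
print(prefix.hex(), key_lo.hex(), key_hi.hex())  # 21 21 7e
```

Note that with code_lo == code_hi this rule consumes the whole code as
prefix and leaves an empty key.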

As its basic procedure, gs_cmap.ps generates 5 items for each range:
prefix, parameter, key, value, and map number (0=space, 1=defined CID,
2=undefined CID).

When the decoded prefix and the lengths of the key and value are the same
as those of the previous range, the (original) gs_cmap.ps appends the
current range specification to the items of the previous range. I call
this a "merge". Below, I call ranges with the same prefix and the same
key/value lengths "merge-able".
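The merge rule can be sketched as follows. This is Python, not the
PostScript of gs_cmap.ps, and the data layout (tuples of bytes, a list of
groups) and the name merge_ranges are assumptions made only for
illustration.

```python
def merge_ranges(ranges):
    """Group consecutive ranges that are merge-able.

    Each range is (prefix, key_lo, key_hi, value), all bytes objects.
    A range is merged into the previous group when it has the same
    prefix and the same key and value lengths. Hypothetical layout,
    not the actual /.TempMaps structure.
    """
    merged = []  # list of (shape, items)
    for prefix, key_lo, key_hi, value in ranges:
        shape = (prefix, len(key_lo), len(value))
        if merged and merged[-1][0] == shape:
            merged[-1][1].append((key_lo, key_hi, value))
        else:
            merged.append((shape, [(key_lo, key_hi, value)]))
    return merged


ranges = [
    (b"\xa1", b"\xa1", b"\xa1", b"\x00\x00"),
    (b"\xa1", b"\xa2", b"\xa2", b"\x00\x0a"),
    (b"\xa2", b"\xa1", b"\xfe", b"\x00\x64"),
]
print(len(merge_ranges(ranges)))  # 2
```

The first two ranges share a prefix and key/value lengths, so they end up
in one group; the third starts a new one.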

To decrease the size of /.TempMaps (and of /.CodeMapData, which is based
on it), I tried to maximize merge-ability as far as possible. When I wrote
code that ignored merge-ability, some CMaps (Chinese EUC) made
/.TempMaps overflow.

For the <code_lo> == <code_hi> case, if I set prefix = <code_lo>
(using the same method as for the <code_lo> != <code_hi> case),
the resulting range is not merge-able at all,
because the resulting prefix is unique.

What I wrote instead: make a 1-byte prefix, and use the rest as the key.
Thus, many "ranges" that specify single characters, such as

	<a1a1> <a1a1> 0
	<a1a2> <a1a2> 10
	<a1a3> <a1a3> 5
	<a1a4> <a1a4> 6
	....

can be merged as ranges with the common <a1> prefix.

I hard-wired "make a 1-byte prefix" for <code_lo> == <code_hi>, but
this does not work for 1-byte range specifications, so
I made a special branch that treats a 1-byte range specification
as a 0-byte prefix and a 1-byte key.
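As a rough sketch of this rule (Python, not the patch code itself; the
name split_single_code is hypothetical):

```python
def split_single_code(code: bytes):
    """Return (prefix, key) for a range whose code_lo == code_hi.

    Hypothetical helper illustrating the rule described above.
    """
    if len(code) == 1:
        return b"", code       # special branch: 0-byte prefix, 1-byte key
    return code[:1], code[1:]  # hard-wired 1-byte prefix, rest is the key


codes = [bytes.fromhex(h) for h in ("a1a1", "a1a2", "a1a3", "a1a4")]
prefixes = {split_single_code(c)[0] for c in codes}
print(prefixes)  # {b'\xa1'}
```

All four single-character codes get the same <a1> prefix, so they stay
merge-able, while a 1-byte code keeps its only byte as the key.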

However, of course, I'm not sure whether this is a good solution.
Do you have any better ideas?

--------------------------------------------------------------------

>You suppose that the range [<0101> <0201>] covers 2 CIDs.
>Why not 256 ones ? Where did you take this knowledge from ?

I did a small experiment with Display PostScript on Solaris 2.8.
First, I defined a "/Z" CMap containing one range specification:

	1 begincidrange
	<2121> <7E21>  6064
	endcidrange

Then I composed a font as follows:

	/Ryumin-Light-Z /Z
	/Ryumin-Light-H findfont /FDepVector get
	composefont

and printed a few characters with this font:

	100 scalefont setfont
	50 50 moveto
	<2121> show
	<2122> show
	<2221> show
	<2222> show
	<2321> show

For <2121>, <2221> and <2321>, the corresponding kanji were printed,
but for <2122> and <2222>, a full-width space (the default glyph for
undefined kanji characters) was printed.

From this result, I concluded that the range <0101> <0201> covers only
<0101> and <0201>.
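The two readings of a range can be contrasted in a short sketch (Python;
the function names are hypothetical and this is not the MDRC patch code).
The per-byte test requires every byte of the code to lie between the
corresponding bytes of code_lo and code_hi, whereas the linear test
compares the codes as whole byte strings.

```python
def in_range_per_byte(code: bytes, lo: bytes, hi: bytes) -> bool:
    """Multi-dimensional reading: each byte is compared independently."""
    return (len(code) == len(lo) == len(hi)
            and all(l <= c <= h for c, l, h in zip(code, lo, hi)))


def in_range_linear(code: bytes, lo: bytes, hi: bytes) -> bool:
    """Linear reading: the code is compared as one big-endian number."""
    return len(code) == len(lo) == len(hi) and lo <= code <= hi


def count_per_byte(lo: bytes, hi: bytes) -> int:
    """Number of codes covered under the per-byte reading."""
    n = 1
    for l, h in zip(lo, hi):
        n *= max(h - l + 1, 0)
    return n


lo, hi = bytes.fromhex("2121"), bytes.fromhex("7E21")
for h in ("2121", "2122", "2221", "2222", "2321"):
    code = bytes.fromhex(h)
    print(h, in_range_per_byte(code, lo, hi), in_range_linear(code, lo, hi))

print(count_per_byte(bytes.fromhex("0101"), bytes.fromhex("0201")))  # 2
```

Only the per-byte reading rejects <2122> and <2222>, which matches what
Display PostScript printed in the experiment above.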

However, I could not find an explicit specification of this case
in the Adobe documentation, so I wonder whether this is what Adobe
thinks "should be so".



