[gs-cvs] rev 8535 - branches/mtrender/src

ray at ghostscript.com
Tue Feb 19 00:29:06 PST 2008


Author: ray
Date: 2008-02-19 00:29:04 -0800 (Tue, 19 Feb 2008)
New Revision: 8535

Added:
   branches/mtrender/src/gxclthrd.c
   branches/mtrender/src/gxclthrd.h
   branches/mtrender/src/gxclthrd1.c
Modified:
   branches/mtrender/src/gdevprn.c
   branches/mtrender/src/gdevprn.h
   branches/mtrender/src/gxclist.c
   branches/mtrender/src/gxclist.h
   branches/mtrender/src/gxclread.c
   branches/mtrender/src/lib.mak
   branches/mtrender/src/msvc32.mak
Log:
Initial functional commit of support for rendering bands in multiple
threads. Requires building with CLIST_THREADS=1 and is controlled
by command line / device parameter -dNumRenderingThreads=#.

DETAILS:

General Goals:

A. To allow multi-core/multi-processor systems to render bands using
   several concurrent threads. Customer #531 is driving a 600 dpi 32-bit
   CMYK device using a quad-core Xeon motherboard as a RIP to render
   PS and PDF documents up to 36" wide. With single threading, the CPU
   utilization remains at 25%.

B. The other goal established by the customer is that they strongly
   preferred minimizing the integration effort, and agreed that the
   best solution would be to have the multiple threads handled internal
   to the graphics library so their custom device driver could simply
   call 'gdev_prn_get_bits' as before.

   By complying with this aspect of the design goal, it also means that
   most, if not all existing devices can take advantage of multi-threaded
   rendering since most devices call 'gdev_prn_get_bits'.

C. My design goals included minimizing impact/changes/risk on existing
   code and providing for automatic fallback to the existing single
   threaded rendering. This was accomplished by hooking the device proc
   get_bits_rectangle and replacing clist_get_bits_rectangle with
   clist_get_bits_rectangle_mt. In the current code, this is done after
   verifying that the library/OS supports multi-threading (see below
   clist_enable_multi_thread_render).
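
   The hook-with-fallback idea in "C" can be sketched as follows. This is
   only an illustration: toy_device, toy_init, get_bits_single, get_bits_mt
   and the threads_supported flag are hypothetical stand-ins, not the real
   gx_device types or the code in this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and its procedure table. */
typedef struct toy_device_s toy_device;
typedef int (*get_bits_proc)(toy_device *dev, int y);

struct toy_device_s {
    get_bits_proc get_bits_rectangle;   /* the hooked procedure */
    int threads_supported;              /* result of a platform probe */
};

/* The two procs return different values only so they can be told apart. */
static int get_bits_single(toy_device *dev, int y) { (void)dev; return y; }
static int get_bits_mt(toy_device *dev, int y)     { (void)dev; return y + 1000; }

/* Returns > 0 if the hook was installed, < 0 if the device stays
 * single threaded. */
static int enable_multi_thread_render(toy_device *dev)
{
    assert(dev != NULL);
    if (!dev->threads_supported)
        return -1;                      /* default proc is left untouched */
    dev->get_bits_rectangle = get_bits_mt;
    return 1;
}

static void toy_init(toy_device *dev, int threads_supported)
{
    dev->threads_supported = threads_supported;
    dev->get_bits_rectangle = get_bits_single;  /* defaults first */
    (void)enable_multi_thread_render(dev);      /* failure is harmless */
}
```

   Because the default proc is installed before the hook is attempted, a
   failed 'enable' leaves a fully working single threaded device, which is
   why the return code can be ignored.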
  
D. A final design goal was to anticipate an improvement on the simple
   sequential look ahead for threads to allow for out of order completion
   of bands to be delivered via a callback into a procedure provided by
   the device (using a new function in the clist rendering). This
   might be useful to customer #531 in a future implementation and allows
   post processing of a band in the same thread that rendered the band.
   This avoids stalls waiting for a band that is complex, allowing free
   threads to proceed rendering further in the page. It does impose the
   need on the post processing to collect bands of the page potentially
   out of order (the 'gather' of the scatter/gather multiprocessing that
   this provides). THIS FEATURE IS NOT PROVIDED IN THIS INITIAL VERSION,
   BUT REFERENCES TO IT ARE MADE IN THE DETAILS BELOW.

Implementation notes:

1. top level *.mak files and lib.mak
   The CLIST_THREADS makefile macro defaults to an empty string, so
   clthread.dev is the default. When CLIST_THREADS=1, clthread1.dev is used.
   This means that upper level makefiles don't need to be changed for the
   current, single threaded behavior.

   Refer to lib.mak for the bulk of the makefile changes. The top level
   makefiles also add a comment and a commented out CLIST_THREADS=1 for
   convenience although this macro can be specified on the make/nmake
   command line (not recommended). The top level makefiles also currently
   set the CFLAGS to define "USE_LOCKING_MEMORY_WRAPPER" when CLIST_THREADS=1
   and on unix and macos, set the 'SYNC' and 'STDLIBS' to use the posix
   lib thread support (I'm not sure if the reason for this not being standard
   as it is on Windows is still valid -- no support on HP-UX).

   Only msvc32.mak top level makefile is modified by this commit.
   TBD:
   src/bcwin32.mak src/dvx-gcc.mak src/macos-mcp.mak src/macosx.mak
   src/msvclib.mak src/openvms.mak src/os2.mak src/ugcclib.mak
   src/unix-gcc.mak src/unixansi.mak src/watclib.mak src/watcw32.mak

2. gxclthrd.c
   There is also a 'stub' module src/gxclthrd.c that implements the
   two exported functions: clist_enable_multi_thread_render  and
   clist_teardown_render_threads. The former returns -1 in case the caller
   cares, the latter is a NOOP.

3. gxclist.c
   The 'clist_init' function now sets the dev->procs to the gs_clist_device_procs,
   then calls 'clist_enable_multi_thread_render' to possibly hook in the
   multiple threaded rendering using the *get_bits_rectangle proc, setting it
   to 'clist_get_bits_rectangle_mt' if the platform and build support multi
   threaded rendering. If multi threaded rendering cannot be supported, the
   procs are unchanged.

   The current code (gxclist.c::clist_init) ignores the return code from the
   'enable' call since rendering will work anyway (in single thread mode).

   The check in 'clist_finish_page' that detects switching from 'reader' to
   'writer' now (also) calls 'clist_teardown_render_threads' to clean up
   from multi-threaded rendering. This will not actually do anything if multiple
   threads had not been setup/started.

4. gxclist.h
   The values used by the multi-threaded rendering are added to the
   gx_device_clist_reader_s structure. The key value in starting the
   threads is the 'render_threads' pointer, which is set to NULL during
   the clist_render_init which is only called when switching from writer
   to reader by clist_close_writer_and_init_reader. The other values
   related to multi-threaded rendering are set by the clist_setup_render_threads
   (see below)

5. gdevprn.c, gdevprn.h
   Setting the device procs to the gs_clist_device_procs is moved into the
   clist_init code (where it belongs) so that the test for multi-threaded
   rendering is performed after the default, single threaded procs are set.

   Also support for a new parameter, NumRenderingThreads is added, setting
   the new 'num_render_threads_requested' parameter. The 'requested' is
   added since the parameter is supported even if the platform doesn't
   support multiple threads. Also fewer threads will actually be used if
   the number of threads requested exceeds the number of bands, or if the
   requested number of threads cannot be started (soft fail).
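
   The relation between the requested and the attempted thread count can be
   sketched as below. threads_to_start is a hypothetical helper, and treating
   a negative request like 0 is an assumption made for the sketch; the commit
   itself only clamps the request to the number of bands.

```c
#include <assert.h>

/* Hypothetical helper: how many render threads to try to start. */
static int threads_to_start(int requested, int band_count)
{
    assert(band_count >= 0);
    if (requested <= 0)
        return 0;       /* 0 is the default: single threaded rendering */
    /* Don't bother starting more threads than there are bands. */
    return requested > band_count ? band_count : requested;
}
```

   For example, -dNumRenderingThreads=8 on a page with 3 bands would attempt
   at most 3 threads.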

   In gdevprn.h, the parameter is added to the gx_prn_device_common macro
   used to define the gx_device_printer_s and to the initializer for this
   structure, the 'prn_device_body_rest2_' macro (setting the default value
   to 0).

6. gxclthrd1.c
		   "enable"
   When CLIST_THREADS=1, clist_enable_multi_thread_render first checks
   that threads are supported by testing that gp_create_thread succeeds,
   because this will fail if the platform doesn't provide a valid
   gp_*sync.c.
   
		   "setup" (starting the multiple threads).
   An array of 'clist_render_thread_control_t' structures is allocated and
   render_threads points to it. This array contains the per thread info
   needed for communication with the main thread and to render a band.

   There is a 'status' used to communicate completion and the error code
   (upon failure), and a gx_semaphore_t for the thread (sema_this) that can
   be used to wait for it. The status is marked BUSY when the thread is
   started/running, and is normally set to DONE when the thread completes.
   When the data is retrieved from the thread's buffer area the status is
   set to IDLE.
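
   The status values and their ordering can be sketched as below. The three
   RENDER_THREAD_* values are those defined in gxclthrd.h in this commit; the
   helper functions are hypothetical and only illustrate how the ordering
   (error < IDLE < DONE < BUSY) lets a caller test "status < 2".

```c
#include <assert.h>

/* Values as defined in gxclthrd.h. */
#define RENDER_THREAD_IDLE 0
#define RENDER_THREAD_DONE 1
#define RENDER_THREAD_BUSY 2

/* Hypothetical helpers over the status field. */
static int thread_is_finished(int status) { return status < RENDER_THREAD_BUSY; }
static int thread_has_error(int status)   { return status < 0; }
static int thread_data_valid(int status)  { return status == RENDER_THREAD_DONE; }

/* Main thread has taken the band out of the buffer: DONE -> IDLE. */
static int consume_band(int status)
{
    assert(thread_data_valid(status));
    return RENDER_THREAD_IDLE;
}
```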

   The structure also includes the clist device to be used for rendering and
   the buffer device for the thread (cdev and bdev, respectively). The 'band'
   parameter identifies the band to be rendered by the thread.

   There is also a 'sema_group' gx_semaphore_t, provided for future
   out of order rendering (see "D" above), that can be used to wait until
   at least one thread is not busy. The clist_render_thread signals
   this group semaphore as well as the individual semaphore prior to exit.
   An out-of-order dispatch function would wait on the group semaphore, then
   check the status of individual threads until it located one that is not
   busy.

   The thread's cdev (clist device) is set to the current configuration
   by a combination of gs_putdeviceparams and explicitly setting the
   'page_uses_transparency' flag which is needed for proper setup of the
   buffer space.

   When the thread is set up, the buffer space unique to that thread is
   allocated and initialized using gdev_prn_allocate_memory (to maximize
   code re-use) but a side effect is that new clist files are opened.
   These are immediately closed and the page_cfile and page_bfile pointers
   are set by opening the main thread's files in read mode. The remainder
   of the initialization for the 'reader' is done by clist_render_init.

   The thread's 'bdev' buf_device is created, but will be set up by the
   thread to reflect the band's boundaries.

   Then the thread is started to render the band and the loop continues to
   the next thread and next band (using the thread_lookahead_direction).

   If starting a thread fails, then the loop breaks early and any data/
   devices allocated for the thread are freed.

   If the loop exits with the loop counter at 0 (i == 0), then no threads
   were successfully started, so the _requested value is set to 0 (to
   prevent subsequent attempts to multi-thread on this page) and the single
   thread is used for rendering.

   The setup is completed by setting 'num_render_threads' to the number
   of threads successfully started and setting the index of the 'first'
   thread started (== 0).
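
   The soft-fail behaviour of the setup loop can be sketched as below. This
   is a simplified model, not the code in gxclthrd1.c: toy_reader,
   setup_threads, start_fn and start_until_limit are hypothetical, and the
   per-thread device and semaphore allocation is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread starter: returns 0 on success, < 0 on failure. */
typedef int (*start_fn)(int thread_index, int band, void *ctx);

typedef struct {
    int num_render_threads;            /* threads actually started */
    int num_render_threads_requested;  /* cleared on total failure */
    int curr_render_thread;            /* index of the 'first' thread */
} toy_reader;

static int setup_threads(toy_reader *r, int first_band, int direction,
                         start_fn start, void *ctx)
{
    int i, band = first_band;

    assert(r != NULL && start != NULL);
    for (i = 0; i < r->num_render_threads_requested; i++, band += direction)
        if (start(i, band, ctx) < 0)
            break;                  /* soft fail: keep what already started */
    if (i == 0) {
        r->num_render_threads_requested = 0;   /* don't retry on this page */
        r->num_render_threads = 0;
        return -1;                  /* caller renders single threaded */
    }
    r->num_render_threads = i;
    r->curr_render_thread = 0;
    return 0;
}

/* Test double: succeeds until *ctx threads have been started. */
static int start_until_limit(int thread_index, int band, void *ctx)
{
    (void)band;
    return thread_index < *(int *)ctx ? 0 : -1;
}
```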
   
			   "start"
   There is a simple 'clist_start_render_thread' that sets up the control
   structure and starts the thread using 'gp_create_thread'

			   "render_thread"
   Each instance of a rendering thread executes 'clist_render_thread'. The
   thread is passed a pointer to its (unique) control/status structure that
   is an element of the crdev->render_threads array.

   The render_thread sets up its buf_device (bdev) for the num_lines in the
   band, then calls 'clist_render_rectangle' to actually render the band.
   After the render_rectangle completes it sets its 'ymin' and 'ymax' to
   reflect the bounds of the data in the buffer, puts the completion status
   into the thread control structure, and signals the semaphores. The 'group'
   semaphore is signalled first since a thread waiting for 'any' of the group
   will still have to check the status to find an idle thread.

   Note that the 'RENDER_THREAD_DONE' status implies that the data in the
   thread's buffer area is valid. Once the data has been accessed and either
   copied out of the buffer, or the buffer pointer has been changed, the
   status is changed to 'RENDER_THREAD_IDLE'.

   Note that the thread will use the tile_cache area set up within the
   buffer space during the 'setup' process (by gdev_prn_allocate_memory)
   even if the data area of the buffer has been changed between subsequent
   executions of a thread with a specific 'crdev'. This works because the
   thread executes 'setup_buf_device' each time with the 'mdata' pointer
   derived from the current crdev->data value.

			   "get_band"
   The 'clist_get_band_from_thread' is invoked by the 'main' thread from
   'clist_get_bits_rectangle_mt' to perform the thread synchronization
   when the data in the main thread's buffer (within the ymin::ymax) does
   not satisfy the data needed.

   The logic to find the thread that contains the needed band is temporarily
   rudimentary (and will be improved shortly).

   The semaphore for the thread corresponding to the band that is needed
   is used for a 'wait' (which returns immediately if the thread has finished)
   then the status is checked to ensure that there was no error.

   The 'data' area pointer of the main thread is swapped with that of the
   thread that has finished, so that the thread can be restarted on a different
   band while the data in the main thread remains intact (as is needed for
   successive calls to gdev_prn_get_bits). The thread's status is flagged as
   'IDLE' (see above) since the data in its buffer is no longer valid, and its
   'band' is set to -1, a value that does not match any actual band, so that
   the main thread's search for a thread that is working on or has completed
   a given band will skip it.
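
   The buffer hand-off can be sketched as a plain pointer exchange. The
   toy_main and toy_thread structures and take_band are hypothetical,
   pared-down stand-ins for the reader device and the thread control
   structure.

```c
#include <assert.h>
#include <stddef.h>

#define RENDER_THREAD_IDLE 0
#define RENDER_THREAD_DONE 1

/* Hypothetical views of the main reader and one render thread. */
typedef struct { unsigned char *data; } toy_main;
typedef struct { unsigned char *data; int status; int band; } toy_thread;

/* Hand the finished band to the main thread by exchanging buffer
 * pointers; no pixel data is copied. */
static void take_band(toy_main *m, toy_thread *t)
{
    unsigned char *tmp;

    assert(t->status == RENDER_THREAD_DONE);
    tmp = m->data;
    m->data = t->data;      /* main thread now holds the rendered band */
    t->data = tmp;          /* thread gets a buffer it may overwrite */
    t->status = RENDER_THREAD_IDLE;
    t->band = -1;           /* matches no real band */
}
```

   After the exchange the thread can be restarted on another band at once,
   while repeated single line reads keep using the buffer now held by the
   main thread.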

			   "get_bits_rectangle_mt"
   This is largely copied from the default 'clist_get_bits_rectangle', which
   also may be called to revert to single threaded rendering under some
   conditions. MAYBE THIS REVERSION SHOULD ISSUE A MESSAGE TO stderr.
   
   When clist_get_bits_rectangle_mt is called, it checks whether multiple
   threads are supported and whether the request is for 'PLANES', which is not
   yet supported. If threads are unavailable or planes are requested, it
   reverts to calling the default single threaded clist_get_bits_rectangle.

   After the clist is ensured to be in 'reader' mode, the render_threads
   pointer is checked for NULL to determine if threads are already started.
   'clist_setup_render_threads' starts the threads if needed. Note that
   if not even one thread can be started, 'num_render_threads_requested'
   is set to zero (to prevent subsequent attempts) and the default
   clist_get_bits_rectangle is used.

   If the data is already within the bounds of the band previously rendered,
   checking 'ymin' and 'ymax', then we don't need to get data from a thread.
   This will be the path for the "normal" single line calls via gdev_prn_get_bits
   until the line is in a band that has not been rendered (it _may_ have been
   rendered in a different thread, but is not yet available to the main thread).

   If the data isn't already available in the main thread's 'data' area of
   the buffer, it calls 'clist_get_band_from_thread' above.
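
   The "is the data already here" test can be sketched as below. The
   toy_band_range structure and both helpers are hypothetical, and treating
   ymax as exclusive is an assumption of this sketch rather than something
   stated in the commit.

```c
#include <assert.h>

/* Hypothetical: rows [ymin, ymax) are currently valid in the main
 * thread's buffer. The exclusive upper bound is an assumption. */
typedef struct { int ymin, ymax; } toy_band_range;

/* Nonzero if all requested rows are already in the main buffer. */
static int rows_available(const toy_band_range *r, int y, int line_count)
{
    assert(line_count > 0);
    return y >= r->ymin && y + line_count <= r->ymax;
}

/* Which band a row falls in, for a fixed band height. */
static int band_of_row(int y, int band_height)
{
    assert(band_height > 0 && y >= 0);
    return y / band_height;
}
```

   With single line reads, rows_available stays true for a whole band, so
   the synchronization in 'clist_get_band_from_thread' is only needed once
   per band.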

   Note that the 'mdata' area is set AFTER the call to get_band_from_thread
   since a buffer swap may have changed the crdev->data pointer. THIS METHOD
   IS USED TO AVOID USING 'memcpy' SINCE THAT WAS DETERMINED TO SLOW DOWN
   MULTI-THREADED RENDERING COMPARED TO SINGLE THREAD.

   TBD: If the 'lines_rasterized' is less than the 'line_count' requested,
   we punt to single threaded rendering. This is untested, and DOES NOT AFFECT
   CALLS VIA 'gdev_prn_get_bits' (the usual case).

			"teardown"
   The 'clist_teardown_render_threads' is usually called after the page has
   been rendered, by 'clist_finish_page'. For each thread in the render_threads
   array, it waits until the thread has finished (usually it already has) and
   then frees the elements allocated by clist_setup_render_threads: the two
   semaphores, the buf_device and the buffer_space. Note that 'gdev_prn_free' does NOT
   rely on the 'data' pointer in the clist_common part of the structure, but
   instead frees using the 'buf' pointer in the gx_prn_device_common. This means
   that the swapping of 'data' pointers in the clist part of the device structure
   will not cause the wrong memory area to be freed.
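
   Why freeing through 'buf' is safe after 'data' pointers have been swapped
   can be sketched as below. toy_dev and its helpers are hypothetical and use
   malloc/free in place of the Ghostscript allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device: 'buf' is the allocation the device owns, 'data'
 * is the pointer used for rendering, which may have been swapped with
 * another device's. */
typedef struct { unsigned char *buf; unsigned char *data; } toy_dev;

static int toy_alloc(toy_dev *d, size_t size)
{
    d->buf = malloc(size);
    d->data = d->buf;
    return d->buf != NULL ? 0 : -1;
}

static void toy_swap_data(toy_dev *a, toy_dev *b)
{
    unsigned char *tmp = a->data;

    a->data = b->data;
    b->data = tmp;
}

/* Free through 'buf' only: each allocation is released exactly once by
 * the device that made it, wherever its 'data' pointer ended up. */
static void toy_free(toy_dev *d)
{
    free(d->buf);
    d->buf = d->data = NULL;
}
```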



Modified: branches/mtrender/src/gdevprn.c
===================================================================
--- branches/mtrender/src/gdevprn.c	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gdevprn.c	2008-02-19 08:29:04 UTC (rev 8535)
@@ -355,9 +355,6 @@
 						  !bufferSpace_is_default);
 	    if (ecode == 0)
 		ecode = code;
-
-	    if ( code >= 0 || (reallocate && pass > 1) )
-		ppdev->procs = gs_clist_device_procs;
 	} else {
 	    /* Render entirely in memory. */
 	    gx_device *bdev = (gx_device *)pmemdev;
@@ -497,6 +494,7 @@
 	(code = param_write_int(plist, "BandWidth", &ppdev->space_params.band.BandWidth)) < 0 ||
 	(code = param_write_int(plist, "BandHeight", &ppdev->space_params.band.BandHeight)) < 0 ||
 	(code = param_write_long(plist, "BandBufferSpace", &ppdev->space_params.band.BandBufferSpace)) < 0 ||
+	(code = param_write_int(plist, "NumRenderingThreads", &ppdev->num_render_threads_requested)) < 0 ||
 	(code = param_write_bool(plist, "OpenOutputFile", &ppdev->OpenOutputFile)) < 0 ||
 	(code = param_write_bool(plist, "ReopenPerPage", &ppdev->ReopenPerPage)) < 0 ||
 	(code = param_write_bool(plist, "PageUsesTransparency",
@@ -542,6 +540,7 @@
     int duplex_set = -1;
     int width = pdev->width;
     int height = pdev->height;
+    int nthreads = ppdev->num_render_threads_requested;
     gdev_prn_space_params sp, save_sp;
     gs_param_string ofs;
     gs_param_dict mdict;
@@ -665,6 +664,16 @@
     read_media("InputAttributes");
     read_media("OutputAttributes");
 
+    switch (code = param_read_int(plist, (param_name = "NumRenderingThreads"), &nthreads)) {
+	case 0:
+	    break;
+	default:
+	    ecode = code;
+	    param_signal_error(plist, param_name, ecode);
+	case 1:
+	    ;
+    }
+
     if (ecode < 0)
 	return ecode;
     /* Prevent gx_default_put_params from closing the printer. */
@@ -682,6 +691,7 @@
 	ppdev->Duplex_set = duplex_set;
     }
     ppdev->space_params = sp;
+    ppdev->num_render_threads_requested = nthreads;
 
     /* If necessary, free and reallocate the printer memory. */
     /* Formerly, would not reallocate if device is not open: */

Modified: branches/mtrender/src/gdevprn.h
===================================================================
--- branches/mtrender/src/gdevprn.h	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gdevprn.h	2008-02-19 08:29:04 UTC (rev 8535)
@@ -246,6 +246,7 @@
 	gx_device_printer *async_renderer;	/* in async writer, pointer to async renderer */\
 	uint clist_disable_mask;	/* mask of clist options to disable */\
 		/* ---- End async rendering support --- */\
+	int num_render_threads_requested;	/* for multiple band rendering threads */\
 	gx_device_procs orig_procs	/* original (std_)procs */
 
 /* The device descriptor */
@@ -383,6 +384,7 @@
 	0/*false*/, -1,	/* Duplex[_set] */\
 	0/*false*/, 0, 0, 0, /* file_is_new ... buf */\
 	0, 0, 0, 0, 0/*false*/, 0, 0, /* buffer_memory ... clist_dis'_mask */\
+	0, 		/* num_render_threads_requested */\
 	{ 0 }	/* ... orig_procs */
 #define prn_device_body_rest_(print_page)\
   prn_device_body_rest2_(print_page, gx_default_print_page_copies)

Modified: branches/mtrender/src/gxclist.c
===================================================================
--- branches/mtrender/src/gxclist.c	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclist.c	2008-02-19 08:29:04 UTC (rev 8535)
@@ -471,7 +471,7 @@
  * Initialize the device state (for writing).  This routine requires only
  * data, data_size, and target to be set, and is idempotent.
  */
-static int
+int
 clist_init(gx_device * dev)
 {
     gx_device_clist_writer * const cdev =
@@ -625,8 +625,12 @@
     if (code < 0)
 	return code;
     code = clist_open_output_file(dev);
-    if ( code >= 0)
+    if ( code >= 0) {
 	code = clist_emit_page_header(dev);
+	dev->procs = gs_clist_device_procs;	/* Must be before enabling multi-threading */
+						/* which may change get_bits_rectangle */
+    }
+    clist_enable_multi_thread_render(dev);	/* if this fails, single thread will be used */
     return code;
 }
 
@@ -661,9 +665,12 @@
 
     /* If this is a reader clist, which is about to be reset to a writer,
      * free any band_complexity_array memory used by same.
+     * since we have been rendering, shut down threads
      */
-    if (!CLIST_IS_WRITER((gx_device_clist *)dev))
-       	gx_clist_reader_free_band_complexity_array( (gx_device_clist *)dev );
+    if (!CLIST_IS_WRITER((gx_device_clist *)dev)) {
+	gx_clist_reader_free_band_complexity_array( (gx_device_clist *)dev );
+	clist_teardown_render_threads(dev);
+    }
 
     if (flush) {
 	if (cdev->page_cfile != 0)

Modified: branches/mtrender/src/gxclist.h
===================================================================
--- branches/mtrender/src/gxclist.h	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclist.h	2008-02-19 08:29:04 UTC (rev 8535)
@@ -263,6 +263,11 @@
 #define clist_disable_pass_thru_params (1 << 5)	/* disable EXCEPT at top of page */
 #define clist_disable_copy_alpha (1 << 6) /* target does not support copy_alpha */
 
+#ifndef clist_render_thread_control_t_DEFINED
+#  define clist_render_thread_control_t_DEFINED
+typedef struct clist_render_thread_control_s clist_render_thread_control_t;
+#endif
+
 /* Define the state of a band list when reading. */
 /* For normal rasterizing, pages and num_pages are both 0. */
 typedef struct gx_device_clist_reader_s {
@@ -273,6 +278,11 @@
     int num_pages;
     gx_band_complexity_t *band_complexity_array;  /* num_bands elements */
     void *offset_map; /* Just against collecting the map as garbage. */
+    int num_render_threads;		/* number of threads being used */
+    clist_render_thread_control_t *render_threads;	/* array of threads */
+    byte *main_thread_data;		/* saved data pointer of main thread */
+    int curr_render_thread;		/* index into array */
+    int thread_lookahead_direction;	/* +1 or -1 */
 } gx_device_clist_reader;
 
 union gx_device_clist_s {
@@ -321,6 +331,9 @@
 
 void clist_init_io_procs(gx_device_clist *pclist_dev, bool in_memory);
 
+/* Initialize the clist data structures, but not the clist files */
+int clist_init(gx_device * dev);
+
 /* Reset (or prepare to append to) the command list after printing a page. */
 int clist_finish_page(gx_device * dev, bool flush);
 
@@ -350,6 +363,9 @@
 /* Do device setup from params passed in the command list. */
 int clist_setup_params(gx_device *dev);
 
+/* Initialize for reading. */
+int clist_render_init(gx_device_clist *dev);
+
 /*
  * Render a rectangle to a client-supplied image.  This implements
  * gdev_prn_render_rectangle for devices that are using banding.
@@ -383,6 +399,23 @@
 void 
 clist_copy_band_complexity(gx_band_complexity_t *this, const gx_band_complexity_t *from);
 
+/* Exports from gxclread used by the multi-threading logic */
+
+int 
+clist_close_writer_and_init_reader(gx_device_clist *cldev);
+
+void
+clist_select_render_plane(gx_device *dev, int y, int height,
+			  gx_render_plane_t *render_plane, int index);
+
+/* Enable multi threaded rendering. Returns > 0 if supported, < 0 if single threaded */
+int
+clist_enable_multi_thread_render(gx_device *dev);
+
+/* Shutdown render threads and free up the related memory */
+void
+clist_teardown_render_threads(gx_device *dev);
+
 #ifdef DEBUG 
 #define clist_debug_rect clist_debug_rect_imp
 void clist_debug_rect_imp(int x, int y, int width, int height);

Modified: branches/mtrender/src/gxclread.c
===================================================================
--- branches/mtrender/src/gxclread.c	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclread.c	2008-02-19 08:29:04 UTC (rev 8535)
@@ -278,12 +278,12 @@
 
 /* Forward references */
 
-static int clist_render_init(gx_device_clist *);
 static int clist_rasterize_lines(gx_device *dev, int y, int lineCount,
-				  gx_device *bdev,
-				  const gx_render_plane_t *render_plane,
-				  int *pmy);
+				gx_device *bdev,
+				const gx_render_plane_t *render_plane,
+				int *pmy);
 
+
 /* Calculate the raster for a chunky or planar device. */
 static int
 clist_plane_raster(const gx_device *dev, const gx_render_plane_t *render_plane)
@@ -294,7 +294,7 @@
 }
 
 /* Select full-pixel rendering if required for RasterOp. */
-static void
+void
 clist_select_render_plane(gx_device *dev, int y, int height,
 			  gx_render_plane_t *render_plane, int index)
 {
@@ -337,7 +337,7 @@
     return code;
 }
 
-static int 
+int 
 clist_close_writer_and_init_reader(gx_device_clist *cldev)
 {
     gx_device_clist_reader * const crdev = &cldev->reader;
@@ -355,7 +355,7 @@
 }
 
 /* Initialize for reading. */
-static int
+int
 clist_render_init(gx_device_clist *dev)
 {
     gx_device_clist_reader * const crdev = &dev->reader;
@@ -367,6 +367,7 @@
     crdev->num_pages = 0;
     crdev->band_complexity_array = NULL;
     crdev->offset_map = NULL;
+    crdev->render_threads = NULL;
     return gx_clist_reader_read_band_complexity(dev);
 }
 

Added: branches/mtrender/src/gxclthrd.c
===================================================================
--- branches/mtrender/src/gxclthrd.c	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclthrd.c	2008-02-19 08:29:04 UTC (rev 8535)
@@ -0,0 +1,29 @@
+/* Copyright (C) 2001-2006 Artifex Software, Inc.
+   All Rights Reserved.
+  
+   This software is provided AS-IS with no warranty, either express or
+   implied.
+
+   This software is distributed under license and may not be copied, modified
+   or distributed except as expressly authorized under the terms of that
+   license.  Refer to licensing information at http://www.artifex.com/
+   or contact Artifex Software, Inc.,  7 Mt. Lassen Drive - Suite A-134,
+   San Rafael, CA  94903, U.S.A., +1(415)492-9861, for further information.
+*/
+
+/*$Id$ */
+/* Command list - dummy thread hook */
+#include "gx.h"
+#include "gxdevice.h"
+#include "gxclist.h"
+
+int 
+clist_enable_multi_thread_render(gx_device *dev)
+{   
+    return -1;
+}
+
+void
+clist_teardown_render_threads(gx_device *dev)
+{
+}


Property changes on: branches/mtrender/src/gxclthrd.c
___________________________________________________________________
Name: svn:executable
   + *

Added: branches/mtrender/src/gxclthrd.h
===================================================================
--- branches/mtrender/src/gxclthrd.h	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclthrd.h	2008-02-19 08:29:04 UTC (rev 8535)
@@ -0,0 +1,42 @@
+/* Copyright (C) 2001-2006 Artifex Software, Inc.
+   All Rights Reserved.
+  
+   This software is provided AS-IS with no warranty, either express or
+   implied.
+
+   This software is distributed under license and may not be copied, modified
+   or distributed except as expressly authorized under the terms of that
+   license.  Refer to licensing information at http://www.artifex.com/
+   or contact Artifex Software, Inc.,  7 Mt. Lassen Drive - Suite A-134,
+   San Rafael, CA  94903, U.S.A., +1(415)492-9861, for further information.
+*/
+
+/* $Id$ */
+/* Command list multiple rendering threads */
+/* Requires gxsync.h */
+
+#ifndef gxclthrd_INCLUDED
+#  define gxcthrd_INCLUDED
+
+#include "gxsync.h"
+
+#define RENDER_THREAD_IDLE 0
+#define RENDER_THREAD_DONE 1
+#define RENDER_THREAD_BUSY 2
+
+#ifndef clist_render_thread_control_t_DEFINED
+#  define clist_render_thread_control_t_DEFINED
+typedef struct clist_render_thread_control_s clist_render_thread_control_t;
+#endif
+
+typedef struct clist_render_thread_control_s {
+    int status;	/* 0: not started, 1: done, 2: busy, < 0: error */ 
+		/* values allow waiting until status < 2 */
+    gx_semaphore_t *sema_this;
+    gx_semaphore_t *sema_group;
+    gx_device *cdev;	/* clist device copy */
+    gx_device *bdev;	/* this thread's buffer device */
+    int band;
+};
+
+#endif /* gxclthrd_INCLUDED */


Property changes on: branches/mtrender/src/gxclthrd.h
___________________________________________________________________
Name: svn:executable
   + *

Added: branches/mtrender/src/gxclthrd1.c
===================================================================
--- branches/mtrender/src/gxclthrd1.c	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/gxclthrd1.c	2008-02-19 08:29:04 UTC (rev 8535)
@@ -0,0 +1,520 @@
+/* Copyright (C) 2001-2006 Artifex Software, Inc.
+   All Rights Reserved.
+  
+   This software is provided AS-IS with no warranty, either express or
+   implied.
+
+   This software is distributed under license and may not be copied, modified
+   or distributed except as expressly authorized under the terms of that
+   license.  Refer to licensing information at http://www.artifex.com/
+   or contact Artifex Software, Inc.,  7 Mt. Lassen Drive - Suite A-134,
+   San Rafael, CA  94903, U.S.A., +1(415)492-9861, for further information.
+*/
+
+/*$Id$ */
+/* Command list - Support for multiple rendering threads */
+#include "memory_.h"
+#include "gx.h"
+#include "gpcheck.h"
+#include "gxsync.h"
+#include "gserrors.h"
+#include "gxdevice.h"
+#include "gsdevice.h"
+#include "gscoord.h"		/* requires gsmatrix.h */
+#include "gxdevmem.h"		/* must precede gxcldev.h */
+#include "gdevprn.h"		/* must precede gxcldev.h */
+#include "gxcldev.h"
+#include "gxgetbit.h"
+#include "gdevplnx.h"
+#include "gsmemory.h"
+#include "gxclthrd.h"
+
+/* Forward reference prototypes */
+static int clist_start_render_thread(gx_device *dev, int thread_index, int band);
+static void clist_render_thread(void *param);
+
+
+/* Set up and start the render threads */
+static int
+clist_setup_render_threads(gx_device *dev, int y)
+{
+    gx_device_printer *pdev = (gx_device_printer *)dev;
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_common *cdev = (gx_device_clist_common *)cldev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    gs_memory_t *mem = cdev->bandlist_memory;
+    gx_device *protodev;
+    gs_c_param_list paramlist;
+    int i, code, band;
+    int band_count = cdev->nbands;
+    char fmode[4];
+
+    crdev->num_render_threads = pdev->num_render_threads_requested;
+
+    if(gs_debug[':'] != 0)
+	dprintf1("Attempting to set up %d rendering threads\n", pdev->num_render_threads_requested);
+
+    if (crdev->num_render_threads > band_count)
+	crdev->num_render_threads = band_count;	/* don't bother starting more threads than bands */
+
+    /* Allocate and initialize an array of thread control structures */
+    crdev->render_threads = (clist_render_thread_control_t *)
+	      gs_alloc_byte_array(mem, crdev->num_render_threads,
+	      sizeof(clist_render_thread_control_t), "clist_setup_render_threads" );
+    /* fallback to non-threaded if allocation fails */
+    if (crdev->render_threads == NULL)
+	return_error(gs_error_VMerror);
+
+    memset(crdev->render_threads, 0, crdev->num_render_threads *
+	    sizeof(clist_render_thread_control_t));
+    crdev->main_thread_data = cdev->data;		/* save data area */
+    /* Based on the line number requested, decide the order of band rendering */
+    if (y == 0) {
+	crdev->thread_lookahead_direction = 1;
+	band = 0;
+    } else {
+	crdev->thread_lookahead_direction = -1;
+	band = band_count;
+    }
+
+    /* Close the files so we can open them in multiple threads */
+    /* TODO: This doesn't work for memfile clist yet, so will fail */
+    if ((code = cdev->page_info.io_procs->fclose(cdev->page_cfile, cdev->page_cfname, false)) < 0 ||
+        (code = cdev->page_info.io_procs->fclose(cdev->page_bfile, cdev->page_bfname, false)) < 0) {
+	gs_free_object(mem, crdev->render_threads, "clist_setup_render_threads");
+	crdev->render_threads = NULL;
+        return_error(gs_error_unknownerror); /* shouldn't happen */
+    }
+    cdev->page_cfile = cdev->page_bfile = NULL;
+    strcpy(fmode, "r");			/* read access for threads */
+    strcat(fmode, gp_fmode_binary_suffix);
+    /* Find the prototype for this device (needed so we can copy from it) */
+    for (i=0; (protodev = (gx_device *)gs_getdevice(i)) != NULL; i++)
+	if (strcmp(protodev->dname, dev->dname) == 0)
+	    break;
+    if (protodev == NULL)
+	return gs_error_rangecheck;
+
+    gs_c_param_list_write(&paramlist, mem);
+    if ((code = gs_getdeviceparams(dev, (gs_param_list *)&paramlist)) < 0)
+	return code;
+
+    /* Loop creating the devices and semaphores for each thread, then start them */
+    for (i=0; i < crdev->num_render_threads; i++, band += crdev->thread_lookahead_direction) {
+	gx_device *ndev;
+	gx_device_clist *ncldev;
+	gx_device_clist_common *ncdev;
+	clist_render_thread_control_t *thread = &(crdev->render_threads[i]);
+
+        thread->band = -1;		/* a value that won't match any valid band */
+	if ((code = gs_copydevice((gx_device **) &ndev, protodev, mem)) < 0) {
+	    code = 0;		/* even though we failed, no cleanup needed */
+	    break;
+	}
+	ncldev = (gx_device_clist *)ndev;
+	ncdev = (gx_device_clist_common *)ndev;
+	gx_device_fill_in_procs(ndev);
+	((gx_device_printer *)ncdev)->buffer_memory = ncdev->memory =
+		ncdev->bandlist_memory = mem;
+	gs_c_param_list_read(&paramlist);
+	if ((code = gs_putdeviceparams(ndev, (gs_param_list *)&paramlist)) < 0)
+	    break;
+	ncdev->page_uses_transparency = cdev->page_uses_transparency;
+	/* gdev_prn_allocate_memory sets the clist for writing, creating new files.
+	 * We need to unlink those files and open the main thread's files, then
+	 * reset the clist state for reading/rendering
+	 */
+	if ((code = gdev_prn_allocate_memory(ndev, NULL, 0, 0)) < 0)
+	    break;
+	thread->cdev = ndev;
+	/* close and unlink the temp files just created */
+	cdev->page_info.io_procs->fclose(ncdev->page_cfile, ncdev->page_cfname, true);
+	cdev->page_info.io_procs->fclose(ncdev->page_bfile, ncdev->page_bfname, true);
+	/* open the main thread's files for this thread */
+	if ((code=cdev->page_info.io_procs->fopen(cdev->page_cfname, fmode, &ncdev->page_cfile,
+			    mem, mem, true)) < 0 ||
+	     (code=cdev->page_info.io_procs->fopen(cdev->page_bfname, fmode, &ncdev->page_bfile,
+			    mem, mem, false)) < 0)
+	    break;
+	clist_render_init(ncldev);	/* Initialize clist device for reading */
+	ncdev->page_bfile_end_pos = cdev->page_bfile_end_pos;
+
+	/* create the buf device for this thread, and allocate the semaphores */
+	if ((code = gdev_create_buf_device(cdev->buf_procs.create_buf_device,
+				&(thread->bdev),
+				cdev->target, NULL,
+				mem, clist_get_band_complexity(dev,y))) < 0)
+	    break;
+	if ((thread->sema_this = gx_semaphore_alloc(mem)) == NULL ||
+	    (thread->sema_group = gx_semaphore_alloc(mem)) == NULL) {
+	    code = gs_error_VMerror;
+	    break;
+	}
+	/* Start thread 'i' to do band */
+	if ((code = clist_start_render_thread(dev, i, band)) < 0)
+	    break;
+    }
+    gs_c_param_list_release(&paramlist);
+    /* If the code < 0, the last thread creation failed -- clean it up */
+    if (code < 0) {
+	/* the following relies on 'free' ignoring NULL pointers */
+	gx_semaphore_free(crdev->render_threads[i].sema_group); 
+	gx_semaphore_free(crdev->render_threads[i].sema_this); 
+	if (crdev->render_threads[i].bdev != NULL)
+	    cdev->buf_procs.destroy_buf_device(crdev->render_threads[i].bdev);
+	if (crdev->render_threads[i].cdev != NULL) {
+	    gdev_prn_free_memory((gx_device *)(crdev->render_threads[i].cdev));
+	    gs_free_object(mem, crdev->render_threads[i].cdev, "clist_setup_render_threads");
+	}
+    }
+    /* If we weren't able to create at least one thread, punt	*/
+    /* Although a single thread isn't any more efficient, the	*/
+    /* machinery still works, so that's OK.			*/
+    if (i == 0) {
+	gs_free_object(mem, crdev->render_threads, "clist_setup_render_threads");
+	crdev->render_threads = NULL;
+	pdev->num_render_threads_requested = 0;	/* shut down thread support */
+	return_error(gs_error_VMerror);
+    }
+    crdev->num_render_threads = i;
+    crdev->curr_render_thread = 0;
+
+    if(gs_debug[':'] != 0)
+	dprintf1("Using %d rendering threads\n", i);
+
+    return 0;
+}
+
+void
+clist_teardown_render_threads(gx_device *dev)
+{
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_common *cdev = (gx_device_clist_common *)dev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    gs_memory_t *mem = cdev->bandlist_memory;
+    int i;
+
+    if (crdev->render_threads != NULL) {
+
+	/* Wait for each thread to finish then free its memory */
+	for (i=0; i < crdev->num_render_threads; i++) {
+	    clist_render_thread_control_t *thread = &(crdev->render_threads[i]);
+	    gx_device_clist_common *thread_cdev = (gx_device_clist_common *)thread->cdev;
+
+	    if (thread->status == RENDER_THREAD_BUSY)
+		gx_semaphore_wait(thread->sema_this);
+	    /* Free control semaphores */
+	    gx_semaphore_free(thread->sema_group);
+	    gx_semaphore_free(thread->sema_this);
+	    /* destroy the thread's buffer device */
+	    cdev->buf_procs.destroy_buf_device(thread->bdev);
+	    /*
+	     * Free the BufferSpace, close the band files 
+	     * Note that the BufferSpace is freed using 'ppdev->buf' so the 'data'
+	     * pointer doesn't need to be the one that the thread started with
+	     */
+	    gdev_prn_free_memory(thread->cdev);
+	    /* Free the device copy this thread used */
+	    gs_free_object(mem, thread->cdev, "clist_teardown_render_threads");
+	}
+	cdev->data = crdev->main_thread_data;	/* restore the pointer for writing */
+	gs_free_object(mem, crdev->render_threads, "clist_teardown_render_threads");
+	crdev->render_threads = NULL;
+
+	/* Now re-open the clist temp files so we can write to them */
+	if (cdev->page_cfile == NULL) {
+	    char fmode[4];
+
+	    strcpy(fmode, "w+");
+	    strcat(fmode, gp_fmode_binary_suffix);
+	    cdev->page_info.io_procs->fopen(cdev->page_cfname, fmode, &cdev->page_cfile,
+				mem, cdev->bandlist_memory, true);
+	    cdev->page_info.io_procs->fopen(cdev->page_bfname, fmode, &cdev->page_bfile,
+				mem, cdev->bandlist_memory, false);
+	}
+    }
+}
+
+static int
+clist_start_render_thread(gx_device *dev, int thread_index, int band)
+{
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    int code;
+
+    crdev->render_threads[thread_index].band = band;
+    crdev->render_threads[thread_index].status = RENDER_THREAD_BUSY;
+    
+    /* Finally, fire it up */
+    code = gp_create_thread(clist_render_thread, &(crdev->render_threads[thread_index]));
+
+    return code;
+}
+
+static void
+clist_render_thread(void *data)
+{
+    clist_render_thread_control_t *thread = (clist_render_thread_control_t *)data;
+    gx_device *dev = thread->cdev;
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    gx_device *bdev = thread->bdev;
+    gs_int_rect band_rect;
+    byte *mdata = crdev->data + crdev->page_tile_cache_size;
+    uint raster = bitmap_raster(dev->width * dev->color_info.depth);
+    int code;
+    int band_height = crdev->page_band_height;
+    int band = thread->band;
+    int band_begin_line = band * band_height;
+    int band_end_line = band_begin_line + band_height;
+    int band_num_lines;
+
+    if (band_end_line > dev->height)
+	band_end_line = dev->height;
+    band_num_lines = band_end_line - band_begin_line;
+
+    code = crdev->buf_procs.setup_buf_device
+	    (bdev, mdata, raster, NULL, 0, band_num_lines, band_num_lines);
+    band_rect.p.x = 0;
+    band_rect.p.y = band_begin_line;
+    band_rect.q.x = dev->width;
+    band_rect.q.y = band_end_line;
+    if (code >= 0)
+	code = clist_render_rectangle(cldev, &band_rect, bdev, NULL, true);
+    /* Reset the band boundaries now */
+    crdev->ymin = band_begin_line;
+    crdev->ymax = band_end_line;
+    crdev->offset_map = NULL;
+    if (code < 0)
+	thread->status = code;		/* shouldn't happen */
+    else
+	thread->status = RENDER_THREAD_DONE;	/* OK */
+
+    /*
+     * Signal the semaphores. We signal the 'group' first since even if
+     * the waiter is released on the group, it still needs to check
+     * status on the thread
+     */
+    gx_semaphore_signal(thread->sema_group);
+    gx_semaphore_signal(thread->sema_this);
+}
+
+/*
+ * Copy the raster data from the completed thread to the caller's
+ * device (the main thread)
+ * Return 0 if OK, < 0 is the error code from the thread 
+ *
+ * After swapping the pointers, start up the completed thread with the
+ * next band remaining to do (if any)
+ */
+static int
+clist_get_band_from_thread(gx_device *dev, int band)
+{
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_common *cdev = (gx_device_clist_common *)dev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    int next_band, code = 0;
+    int thread_index = crdev->curr_render_thread;
+    clist_render_thread_control_t *thread = &(crdev->render_threads[thread_index]);
+    gx_device_clist_common *thread_cdev = (gx_device_clist_common *)thread->cdev;
+    int band_height = crdev->page_info.band_params.BandHeight;
+    int band_count = cdev->nbands;
+    byte *tmp;			/* for swapping data areas */
+
+    /* We expect that the thread needed will be the 'current' thread */
+    if (thread->band != band) {
+	/*
+	 * TODO: maybe we should search for it, and if not found wait for
+	 * an idle thread and start that one
+	 */
+	eprintf2("clist_get_band_from_thread: at band %d, needed band %d\n",
+		thread->band, band);
+        return_error(gs_error_rangecheck);
+    }
+    /* Wait for this thread */
+    gx_semaphore_wait(thread->sema_this);
+    if (thread->status < 0)
+	return thread->status;		/* FAIL */
+
+    /* Swap the data areas to avoid the copy */
+    tmp = cdev->data;
+    cdev->data = thread_cdev->data;
+    thread_cdev->data = tmp;
+    thread->status = RENDER_THREAD_IDLE;	/* the data is no longer valid */
+    thread->band = -1;
+    /* Update the bounds for this band */
+    cdev->ymin =  band * band_height;
+    cdev->ymax =  cdev->ymin + band_height;
+    if (cdev->ymax > dev->height)
+	cdev->ymax = dev->height;
+
+    /* If we are not at the final band, start up this thread with the next one to do */
+    next_band = band + (crdev->num_render_threads * crdev->thread_lookahead_direction);
+    if (next_band >= 0 && next_band < band_count)
+	code = clist_start_render_thread(dev, thread_index, next_band);
+    /* bump the 'curr' to the next thread */
+    crdev->curr_render_thread = crdev->curr_render_thread == crdev->num_render_threads - 1 ?
+		0 : crdev->curr_render_thread + 1;
+
+    return code;
+}
+
+/* Copy a rasterized rectangle to the client, rasterizing if needed. */
+/* The first invocation starts multiple threads to perform "look ahead" */
+/* rendering adjacent to the first band (forward or backward) */
+static int
+clist_get_bits_rect_mt(gx_device *dev, const gs_int_rect * prect,
+			 gs_get_bits_params_t *params, gs_int_rect **unread)
+{
+    gx_device_printer *pdev = (gx_device_printer *)dev;
+    gx_device_clist *cldev = (gx_device_clist *)dev;
+    gx_device_clist_common *cdev = (gx_device_clist_common *)dev;
+    gx_device_clist_reader *crdev = &cldev->reader;
+    gs_memory_t *mem = cdev->bandlist_memory;
+    gs_get_bits_options_t options = params->options;
+    int y = prect->p.y;
+    int end_y = prect->q.y;
+    int line_count = end_y - y;
+    int band_height = crdev->page_info.band_params.BandHeight;
+    int band = y / band_height;
+    gs_int_rect band_rect;
+    int lines_rasterized;
+    gx_device *bdev;
+    byte *mdata;
+    uint raster = bitmap_raster(dev->width * dev->color_info.depth);
+    int my;
+    int code = 0;
+
+    /* This page might not want multiple threads */
+    /* Also we don't support plane extraction using multiple threads */
+    if (pdev->num_render_threads_requested < 1 || (options & GB_SELECT_PLANES))
+	return clist_get_bits_rectangle(dev, prect, params, unread);
+
+    if (prect->p.x < 0 || prect->q.x > dev->width ||
+	y < 0 || end_y > dev->height
+	)
+	return_error(gs_error_rangecheck);
+    if (line_count <= 0 || prect->p.x >= prect->q.x)
+	return 0;
+
+    if((code = clist_close_writer_and_init_reader(cldev)) < 0)
+	return code;
+    
+    if (crdev->render_threads == NULL) {
+        if ((code = clist_setup_render_threads(dev, y)) < 0) {
+	    /* revert to the default single threaded rendering */
+	    return clist_get_bits_rectangle(dev, prect, params, unread);
+	}
+    } 
+    /* If we don't already have the band's data, get it from its thread */
+    if (y < crdev->ymin || end_y > crdev->ymax)
+	code = clist_get_band_from_thread(dev, band);
+    if (code < 0)
+	goto free_thread_out;
+    mdata = crdev->data + crdev->page_tile_cache_size;
+    if ((code = gdev_create_buf_device(cdev->buf_procs.create_buf_device,
+				  &bdev, cdev->target, NULL,
+				  mem, clist_get_band_complexity(dev,y))) < 0 ||
+	(code = crdev->buf_procs.setup_buf_device(bdev, mdata, raster, NULL,
+			    y - crdev->ymin, line_count, crdev->ymax - crdev->ymin)) < 0)
+	goto free_thread_out;
+
+    lines_rasterized = min(band_height, line_count);
+    /* Return as much of the rectangle as falls within the rasterized lines. */
+    band_rect = *prect;
+    band_rect.p.y = 0;
+    band_rect.q.y = lines_rasterized;
+    code = dev_proc(bdev, get_bits_rectangle)
+	(bdev, &band_rect, params, unread);
+    cdev->buf_procs.destroy_buf_device(bdev);
+    if (code < 0)
+	goto free_thread_out;
+
+    /* Note that if called via 'get_bits', the line count will always be 1 */
+    if (lines_rasterized == line_count) {
+	return code;		
+    }
+
+/***** TODO: Handle the below with data from the threads *****/
+    /*
+     * We'll have to return the rectangle in pieces.  Force GB_RETURN_COPY
+     * rather than GB_RETURN_POINTER, and require all subsequent pieces to
+     * use the same values as the first piece for all of the other format
+     * options.  If copying isn't allowed, or if there are any unread
+     * rectangles, punt.
+     */
+    if (!(options & GB_RETURN_COPY) || code > 0)
+	return gx_default_get_bits_rectangle(dev, prect, params, unread);
+    options = params->options;
+    if (!(options & GB_RETURN_COPY)) {
+	/* Redo the first piece with copying. */
+	params->options = options =
+	    (params->options & ~GB_RETURN_ALL) | GB_RETURN_COPY;
+	lines_rasterized = 0;
+    }
+    {
+	gs_get_bits_params_t band_params;
+	uint raster = gx_device_raster(bdev, true);
+
+	code = gdev_create_buf_device(cdev->buf_procs.create_buf_device,
+				      &bdev, cdev->target, NULL,
+				      mem, clist_get_band_complexity(dev, y));
+	if (code < 0)
+	    return code;
+	band_params = *params;
+	while ((y += lines_rasterized) < end_y) {
+	    /* Increment data pointer by lines_rasterized. */
+	    if (band_params.data)
+		band_params.data[0] += raster * lines_rasterized;
+	    line_count = end_y - y;
+	    /* code = clist_rasterize_lines(dev, y, line_count, bdev, NULL, &my); */
+	    if (code < 0)
+		break;
+	    lines_rasterized = min(code, line_count);
+	    band_rect.p.y = my;
+	    band_rect.q.y = my + lines_rasterized;
+	    code = dev_proc(bdev, get_bits_rectangle)
+		(bdev, &band_rect, &band_params, unread);
+	    if (code < 0)
+		break;
+	    params->options = options = band_params.options;
+	    if (lines_rasterized == line_count)
+		break;
+	}
+	cdev->buf_procs.destroy_buf_device(bdev);
+    }
+    return code;
+
+/* Free up thread stuff */
+free_thread_out:
+    clist_teardown_render_threads(dev);
+    return code;
+}
+
+static void
+test_threads(void *dummy)
+{
+}
+
+int 
+clist_enable_multi_thread_render(gx_device *dev)
+{   
+    gx_device_printer *pdev = (gx_device_printer *)dev;
+    int code;
+
+    /* We need to test gp_create_thread since we may be on a platform  */
+    /* built without working threads, i.e., using the gp_nsync.c dummy */
+    /* routines. The nosync gp_create_thread returns a negative code.  */
+    if ((code = gp_create_thread(test_threads, NULL)) < 0) {
+	if (gs_debug[':'] != 0)
+	    dprintf("Using single threaded rendering\n");
+	pdev->num_render_threads_requested = 0;
+	return code;	/* Threads don't work */
+    }
+
+    if (gs_debug[':'] != 0)
+	dprintf("Multi threaded rendering enabled.\n");
+    set_dev_proc(dev, get_bits_rectangle, clist_get_bits_rect_mt);
+
+    return 1;
+}


Property changes on: branches/mtrender/src/gxclthrd1.c
___________________________________________________________________
Name: svn:executable
   + *

Modified: branches/mtrender/src/lib.mak
===================================================================
--- branches/mtrender/src/lib.mak	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/lib.mak	2008-02-19 08:29:04 UTC (rev 8535)
@@ -129,6 +129,7 @@
 gx_h=$(GLSRC)gx.h $(stdio__h) $(gdebug_h)\
  $(gserror_h) $(gsio_h) $(gsmemory_h) $(gstypes_h)
 gxsync_h=$(GLSRC)gxsync.h $(gpsync_h) $(gsmemory_h)
+gxclthrd_h=$(GLSRC)gxclthrd.h $(GLSRC)gxsync.h $(gpsync_h) $(gsmemory_h)
 # Out of order
 gsmemlok_h=$(GLSRC)gsmemlok.h $(gsmemory_h) $(gxsync_h)
 gsnotify_h=$(GLSRC)gsnotify.h $(gsstype_h)
@@ -1642,7 +1643,7 @@
 # clfile works for page clist iff it is included.
 
 $(GLD)clist.dev : $(LIB_MAK) $(ECHOGS_XE) $(clist_)\
- $(GLD)cl$(BAND_LIST_STORAGE).dev\
+ $(GLD)cl$(BAND_LIST_STORAGE).dev $(GLD)clthread$(CLIST_THREADS).dev\
  $(GLD)clmemory.dev\
  $(GLD)cfe.dev $(GLD)cfd.dev $(GLD)rle.dev $(GLD)rld.dev $(GLD)psl2cs.dev
 	$(SETMOD) $(GLD)clist $(clbase1_)
@@ -1650,7 +1651,7 @@
 	$(ADDMOD) $(GLD)clist -obj $(clbase3_)
 	$(ADDMOD) $(GLD)clist -obj $(clbase4_)
 	$(ADDMOD) $(GLD)clist -obj $(clpath_)
-	$(ADDMOD) $(GLD)clist -include $(GLD)cl$(BAND_LIST_STORAGE)
+	$(ADDMOD) $(GLD)clist -include $(GLD)cl$(BAND_LIST_STORAGE) $(GLD)clthread$(CLIST_THREADS)
 	$(ADDMOD) $(GLD)clist -include $(GLD)clmemory
 	$(ADDMOD) $(GLD)clist -include $(GLD)cfe $(GLD)cfd $(GLD)rle $(GLD)rld $(GLD)psl2cs
 
@@ -1756,6 +1757,22 @@
  $(gsmemory_h) $(gstypes_h) $(gxclmem_h) $(szlibx_h)
 	$(GLCC) $(GLO_)gxclzlib.$(OBJ) $(C_) $(GLSRC)gxclzlib.c
 
+# Dummy module - clist rendering in same thread as graphics library
+$(GLD)clthread.dev: $(GLOBJ)gxclthrd.$(OBJ) 
+	$(SETMOD) $(GLD)clthread $(GLOBJ)gxclthrd.$(OBJ)
+
+$(GLOBJ)gxclthrd.$(OBJ) :  $(GLSRC)gxclthrd.c $(gxclist_h)
+	$(GLCC) $(GLO_)gxclthrd.$(OBJ) $(C_) $(GLSRC)gxclthrd.c
+
+# Support for multiple clist rendering threads.
+$(GLD)clthread1.dev: $(GLOBJ)gxclthrd1.$(OBJ) $(GLD)$(SYNC).dev
+	$(SETMOD) $(GLD)clthread1 $(GLOBJ)gxclthrd1.$(OBJ)
+	$(ADDMOD) $(GLD)clthread1 -include $(GLD)$(SYNC).dev
+
+$(GLOBJ)gxclthrd1.$(OBJ) :  $(GLSRC)gxclthrd1.c $(gxclist_h) $(gxsync_h) $(gxclthrd_h)
+	$(GLCC) $(GLO_)gxclthrd1.$(OBJ) $(C_) $(GLSRC)gxclthrd1.c
+
+
 # ---------------- Vector devices ---------------- #
 # We include this here for the same reasons as page.dev.
 
@@ -1787,7 +1804,7 @@
 	$(GLCC) $(GLO_)siinterp.$(OBJ) $(C_) $(GLSRC)siinterp.c
 
 $(GLOBJ)siscale.$(OBJ) : $(GLSRC)siscale.c $(AK)\
- $(math__h) $(memory__h) $(stdio__h) $(gdebug_h)\
+ $(math__h) $(memory__h) $(stdio__h) $(stdint__h) $(gdebug_h)\
  $(siscale_h) $(strimpl_h)
 	$(GLCC) $(GLO_)siscale.$(OBJ) $(C_) $(GLSRC)siscale.c
 
@@ -2284,7 +2301,7 @@
 	$(ADDMOD) $(GLD)psl2lib -include $(GLD)colimlib $(GLD)psl2cs
 
 $(GLOBJ)gxiscale.$(OBJ) : $(GLSRC)gxiscale.c $(GXERR)\
- $(math__h) $(memory__h) $(gpcheck_h)\
+ $(math__h) $(memory__h) $(stdint__h) $(gpcheck_h)\
  $(gsccolor_h) $(gspaint_h)\
  $(gxarith_h) $(gxcmap_h) $(gxcpath_h) $(gxdcolor_h) $(gxdevice_h)\
  $(gxdevmem_h) $(gxfixed_h) $(gxfrac_h) $(gximage_h) $(gxistate_h)\

Modified: branches/mtrender/src/msvc32.mak
===================================================================
--- branches/mtrender/src/msvc32.mak	2008-02-19 08:22:12 UTC (rev 8534)
+++ branches/mtrender/src/msvc32.mak	2008-02-19 08:29:04 UTC (rev 8535)
@@ -300,7 +300,7 @@
 LARGEST_UINTEGER_TYPE=unsigned __int64
 GX_COLOR_INDEX_TYPE=$(LARGEST_UINTEGER_TYPE)
 
-CFLAGS=$(CFLAGS) /DGX_COLOR_INDEX_TYPE="$(GX_COLOR_INDEX_TYPE)"
+CFLAGS=$(CFLAGS) /DGX_COLOR_INDEX_TYPE="$(GX_COLOR_INDEX_TYPE)" $(MEMWRAP)
 !endif
 
 # -W3 generates too much noise.
@@ -624,6 +624,11 @@
 BAND_LIST_COMPRESSOR=zlib
 !endif
 
+# Choose whether or not to support rendering bands in multiple threads
+# to improve performance on multi-core systems. CLIST_THREADS=1 to enable.
+# default to single thread clist rendering by leaving the macro as ""
+### CLIST_THREADS=1
+
 # Choose the implementation of file I/O: 'stdio', 'fd', or 'both'.
 # See gs.mak and sfxfd.c for more details.
 
@@ -671,6 +676,10 @@
 
 # ---------------------------- End of options ---------------------------- #
 
+!if "$(CLIST_THREADS)" == "1"
+CFLAGS=$(CFLAGS) /DUSE_LOCKING_MEMORY_WRAPPER
+!endif
+
 # Define the name of the makefile -- used in dependencies.
 
 MAKEFILE=$(PSSRCDIR)\msvc32.mak


