Rendering in Painterly Style

Specification

Many image manipulation systems can mimic painterly techniques by applying digital filters to the image data. Commonly found filters include charcoal, watercolor and oil. Depending on the size (area) of the picture, the filters in use and their implementation, applying these filters is in general computationally very intensive and in most cases far from real-time. To render an object or a scene in a painterly style, we have to deal with three problems and find solutions that can be implemented efficiently in real time. These problems are:

In theory it would be possible to render an object directly in a painterly style, but since no established methods for this exist, we could not take advantage of modern accelerated rendering hardware.

Implementation

For demonstration purposes we will use the image shown in Figure 1. Even though it is not a rendered image but the digitised version of an actual photograph, we take the initial rendering for granted, as many established methods exist to create (realistic) imagery. Our concern lies with sampling/filtering the image and displaying the modified version of the original.


  Figure 1 - Demo picture. Chosen for the clarity of its colours and the presence of
well-defined edges (sign) as well as fuzzy regions (tree).
[All originals copyright Gunnar Schulz 1999]

To demonstrate the effects created by modern image manipulation software, we display some samples in Figure 2. The degree to which these filters analyse the original picture is unknown to the authors, but we assume that some filters make a single pass over the image, while others detect edges, perform recursive refinements or carry out other expensive operations. Creating the pictures below with Photoshop 5.5 on a PIII system running Win98 with 65MB of RAM took roughly half a second per picture, which amounts to two frames per second. The pictures are small (218 x 218) and the complexity of the algorithms implementing the filters is proportional to the area of the images. As the screen area of an object is inversely proportional to the square of its distance from the viewer, the cost of the filtering stage is highly non-linear in that distance: halving the distance quadruples the number of pixels to be filtered. This results in an extreme performance loss when an object approaches the viewer.

Paint Daubs effect | Spatter effect | Sponge effect

  Figure 2 - The original picture after different filters have been applied. 

We did in fact observe this after implementing the following rendering scheme:

Several tricks and optimisations have been applied in the implementation of the above-mentioned scheme. They are discussed in a separate section, as they are generally applicable and not specific to painterly rendering. The actual filter that we used in our implementation has been applied to the pictures in Figure 3.

Oil filter applied once | Oil filter applied twice

  Figure 3 - Oil filter algorithm with radius of 10 pixels applied once and twice respectively. 

To overcome the very severe real-time limitations of the first implementation, we designed and implemented our own filtering and re-painting technique. Here we make use of brushes (fundamentally luminance-textures of monochrome or grayscale values) which are painted over the scene. Various factors influence the re-rendering. These include:

The brush texture obviously has an influence on how strokes or dabs are applied to the painting. Several textures can be used to obtain a heterogeneous mixture of brush-shapes.

The number of brushes has various effects on the re-rendering. It determines the rendering-rate and has to be high enough to cover the painting-area while being small enough to allow fast rendering. In practice it should be proportional to the area of the original image to provide a constant resolution. Fewer brushes can be used if the original rendering is painted over (as opposed to re-rendering on a blank background) because the brushes are blended with the background (see Figure 4 for details).

The brush size has to be chosen in accordance with the number of brushes so that the entire painting-area is covered. During rendering the size is varied to provide a greater variety in the shape and size of brushes.
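The relationship between image area, number of brushes and brush size can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names, the density parameter brushesPerPixel and the coverage factor are hypothetical, and brushes are treated as squares for simplicity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: number of brushes for an image, given a density in
// brushes per pixel (an assumed tuning parameter, not from the article).
std::size_t brushCount(int width, int height, double brushesPerPixel)
{
    assert(width > 0 && height > 0 && brushesPerPixel > 0.0);
    const double area = static_cast<double>(width) * height;
    return static_cast<std::size_t>(std::ceil(area * brushesPerPixel));
}

// Hypothetical helper: edge length (in pixels) of a square brush such that
// `count` brushes together have `coverage` times the area of the image.
// A coverage above 1 lets neighbouring brushes overlap.
double brushSize(int width, int height, std::size_t count, double coverage)
{
    assert(width > 0 && height > 0 && count > 0 && coverage > 0.0);
    const double area = static_cast<double>(width) * height;
    return std::sqrt(coverage * area / static_cast<double>(count));
}
```

Keeping the density constant makes the brush count grow with the image area, while the second helper shows why fewer brushes require larger brushes to cover the same area.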

Brush movement is random and roughly follows a Brownian motion. This provides a dynamic appearance, mimicking the inconsistencies one would expect from a human painter aiming to produce an animation in a painterly style. We tested our model with completely random brush positions in every frame, but found the resulting flickering too disturbing and therefore added the Brownian motion.
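A random walk of this kind might look like the sketch below. It is an assumption-laden illustration rather than the authors' implementation: the Brush struct, the function stepBrushes and the per-frame step size sigma are hypothetical, and positions are taken to be normalised to [0, 1].

```cpp
#include <algorithm>
#include <random>
#include <vector>

struct Brush { float x, y; };   // normalised position, each component in [0, 1]

// Move every brush by a small Gaussian step and clamp it to the image.
// `sigma` (standard deviation of the step per frame, must be > 0) is an
// assumed tuning parameter: small values give slow drift, large values
// approach the flickering of fully random placement.
void stepBrushes(std::vector<Brush>& brushes, float sigma, std::mt19937& rng)
{
    std::normal_distribution<float> step(0.0f, sigma);
    for (Brush& b : brushes) {
        b.x = std::min(1.0f, std::max(0.0f, b.x + step(rng)));
        b.y = std::min(1.0f, std::max(0.0f, b.y + step(rng)));
    }
}
```

Calling stepBrushes once per frame keeps each brush close to its previous position, which is what distinguishes this motion from re-drawing all brushes at new random positions.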

Sampling of the source-picture is determined by the position of the brushes. A single pixel's color value is chosen to determine the brush's color. This approach is simple, fast and effective. More elaborate approaches can be imagined and could be used to implement different painting-styles.
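Single-pixel sampling can be sketched as below. The image layout (tightly packed RGB, row-major) and the names Rgb, Image and sampleBrushColour are assumptions made for this illustration; the article does not specify them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Assumed image layout: tightly packed RGB, row-major.
struct Image {
    int width;
    int height;
    std::vector<Rgb> pixels;   // width * height entries
};

// Return the colour of the single pixel under a brush whose position
// (u, v) is given in normalised [0, 1] co-ordinates.
Rgb sampleBrushColour(const Image& img, float u, float v)
{
    assert(img.width > 0 && img.height > 0);
    assert(img.pixels.size() == static_cast<std::size_t>(img.width) * img.height);
    // Scale to pixel indices and clamp so that u == 1 or v == 1 stays in range.
    const int x = std::min(img.width - 1, std::max(0, static_cast<int>(u * img.width)));
    const int y = std::min(img.height - 1, std::max(0, static_cast<int>(v * img.height)));
    return img.pixels[static_cast<std::size_t>(y) * img.width + x];
}
```

A more elaborate sampler could, for example, average several pixels under the brush; the single lookup shown here costs one memory access per brush.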

Rendering on blank Background | Rendering over Original

  Figure 4 - Our Brush-Oil filter with 5000 Brushes on blank Background
and with 2000 Brushes rendered over the original picture.
[see Gallery for details]

Our latest approach is fully supported by modern graphics accelerators. Frame-rates are interactive and real-time. In addition, the great number of factors influencing the re-rendering allows numerous different painterly styles to be implemented. Possible additions with respect to this are:

It should be noted that most of these (and other additions) can be implemented without a significant loss in performance. Figure 5 shows a typical picture that has been processed with our Brush-Filters. Please visit the Gallery for more examples.


  Figure 5 - a typical Re-rendering with our current Painterly-style system.

Solutions to implementation-specific problems

In order to implement our Painterly-style renderer (PSR), we solved several problems, most of which are not limited to PSR. Several of our techniques are very efficient and due to their generality can be applied to a diversity of other situations.

Minimising the Scanning Region

A naïve approach to acquiring the image to be processed would be to select the entire screen. Since we have already determined that the complexity of a filter is in most cases at best proportional to the area of the image to be processed, we want to minimise this area as much as possible. This has to be done in a cheap and efficient manner, as we do not want to trade filtering-time for time spent on minimising the scanning region. We do this by projecting the bounding box of an object into screen-space and selecting the scanning region by evaluating the extrema of the projected vertices. This is shown in Figure 6. This approach is fast (projection of exactly 8 vertices, independent of the size or shape of the object) and in most cases gives a fairly good approximation of the screen area of an object while guaranteeing to contain all of its screen-pixels.
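The selection of the scanning region from the projected corners can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes the eight bounding-box corners have already been projected to window co-ordinates in pixels, and the names Point2, Rect and scanRegion are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

struct Point2 { float x, y; };           // window co-ordinates in pixels
struct Rect   { int x0, y0, x1, y1; };   // half-open: [x0, x1) x [y0, y1)

// Smallest pixel rectangle that contains all eight projected corners,
// clipped to a screen of screenW x screenH pixels. If the box lies
// entirely off-screen the returned rectangle is empty (x0 == x1 or y0 == y1).
Rect scanRegion(const std::array<Point2, 8>& corners, int screenW, int screenH)
{
    float minX = corners[0].x, maxX = corners[0].x;
    float minY = corners[0].y, maxY = corners[0].y;
    for (const Point2& p : corners) {
        minX = std::min(minX, p.x);  maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y);  maxY = std::max(maxY, p.y);
    }
    Rect r;
    r.x0 = std::min(screenW, std::max(0, static_cast<int>(std::floor(minX))));
    r.y0 = std::min(screenH, std::max(0, static_cast<int>(std::floor(minY))));
    r.x1 = std::max(r.x0, std::min(screenW, static_cast<int>(std::ceil(maxX))));
    r.y1 = std::max(r.y0, std::min(screenH, static_cast<int>(std::ceil(maxY))));
    return r;
}
```

The cost is a fixed 8 comparisons per axis, regardless of how many vertices the object itself has.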


  Figure 6 - An object (star), its bounding box (green) and
the connected maxima of the projected bounding box (red) 

Projection of Bounding Box (BB)

Our next problem was logically linked to the previous one: projecting the vertices of the BB onto screen-space quickly and efficiently. This was done using an innovative method which takes advantage of the capabilities of modern graphics accelerators (those providing an on-board T&L engine). In general this projection can be achieved by a matrix multiplication in 4D (homogeneous space), provided we can obtain the correct multiplication matrix. In other words, the projected vertex p (vector) of another vertex v (vector) can be obtained with:

p = Mv    [Equation 1]

where M is the multiplication matrix. In practice M may itself be composed of several other matrices according to:

M = M1.M2.....Mn    [Equation 2]

Our method takes advantage of two facts. Firstly, the multiplication of four 4-component vectors vi with a 4x4 matrix M can be written as the matrix multiplication of M with another matrix V whose columns are the vectors vi. The results can then be found in the columns of the product P. (OpenGL stores matrices in column-major order, so each column occupies four consecutive floats of the 16-element array.)

   p1 = Mv1,   p2 = Mv2,   p3 = Mv3,   p4 = Mv4     is equivalent to     P = MV,   with V = [v1 v2 v3 v4] and P = [p1 p2 p3 p4]

Secondly, the matrix M can easily be obtained in OpenGL with a series of commands that query the current projection and model-view matrices. Combining these two facts allows us to do the following:

This method is shown in C++ code in Listing 1. Its advantage is that it exploits the hardware matrix multiplication capabilities of modern graphics accelerators and is therefore very fast. In a multi-threaded system, we could thus use a graphics card as a secondary floating point unit, specialised in matrix multiplication (for a 4x4 matrix this amounts to 16*4=64 multiplications and 16*3=48 additions, a total of 112 floating point operations).
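For reference, the product that the graphics hardware is asked to compute can be written out on the CPU. The sketch below is an illustration only, assuming OpenGL's column-major storage in which each vertex occupies four consecutive floats; Mat4 and multiply are hypothetical names and not part of OpenGL.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;   // column-major: element (row, col) is at [col * 4 + row]

// Returns M * V. Each group of four consecutive floats in V is one vertex,
// and the same group in the result is that vertex transformed by M, so
// four vertices are transformed by a single matrix product.
Mat4 multiply(const Mat4& M, const Mat4& V)
{
    Mat4 P{};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;                            // 4 multiplications, 3 real additions
            for (int k = 0; k < 4; ++k)
                sum += M[k * 4 + row] * V[col * 4 + k];
            P[col * 4 + row] = sum;
        }
    }
    return P;
}
```

Two such products, one per set of four bounding-box corners, transform all eight corners; the inner loop also makes the count of 64 multiplications per product easy to verify.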

Some problems with our method have been raised:

While we acknowledge the first point, we assume that as hardware development progresses, more and more graphics adapters will be equipped with on-board T&L engines. In fact NVIDIA is currently developing GPUs (Graphics Processing Units) that can be programmed by the user to perform customised lighting calculations and other effects.

The second point could also be a valid one, although we have neither seen tests on this nor performed any ourselves. Again, the trend in current hardware development is towards faster and wider graphics buses. In addition to this, we could imagine our method being used in other applications, where the data transfer is minimised. For example, the RGB-values of an image could be interpreted as co-ordinates in RGB-space. The image could then be loaded onto the graphics card using very efficient vertex-arrays. Depending on the manufacturer and implementation, these vertices can be (and are) cached in the memory of the graphics card itself. Different matrices can then be applied to the image data to perform image manipulation tasks - an otherwise extremely processor intensive task.
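To illustrate what applying a matrix to image data means, the sketch below transforms one colour on the CPU by treating (r, g, b, 1) as a homogeneous point. It is a hypothetical example, not something the authors implemented: the names Colour, transformColour and kInvert are assumptions, as is the choice of an inversion matrix.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;   // column-major, as in OpenGL
struct Colour { float r, g, b; };     // components in [0, 1]

// Treat (r, g, b, 1) as a homogeneous point and compute m * c.
// The fourth output component is not needed for affine colour matrices.
Colour transformColour(const Mat4& m, const Colour& c)
{
    return Colour{
        m[0] * c.r + m[4] * c.g + m[8]  * c.b + m[12],
        m[1] * c.r + m[5] * c.g + m[9]  * c.b + m[13],
        m[2] * c.r + m[6] * c.g + m[10] * c.b + m[14]
    };
}

// Example matrix (an assumption for illustration): invert every channel,
// c' = 1 - c. The last column carries the constant offset.
const Mat4 kInvert = { -1.0f,  0.0f,  0.0f, 0.0f,
                        0.0f, -1.0f,  0.0f, 0.0f,
                        0.0f,  0.0f, -1.0f, 0.0f,
                        1.0f,  1.0f,  1.0f, 1.0f };
```

On T&L hardware the same arithmetic would be applied to every pixel submitted as a vertex, with the matrix chosen to suit the desired manipulation.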

Our point is that modern graphics accelerators are very powerful (and quite complete) micromachines, containing significant amounts of memory (currently between 16MB and 64MB) and processing power (even if specialised), which could and should be utilised for tasks other than graphics processing.

Conversion between image-space and screen-space

By using normalised co-ordinates (in the range [0..1]) for our brush-positions, we can easily convert to image-space or screen-space by multiplying the brush-co-ordinates by the dimensions of the image or screen respectively. This de-couples the image from the screen and allows their dimensions to be independent.
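The conversion amounts to one multiplication per axis, as in the following sketch. The names Vec2 and toPixels are hypothetical and chosen for this illustration only.

```cpp
struct Vec2 { float x, y; };

// Scale a normalised position (each component in [0, 1]) to a target of
// the given dimensions, e.g. the source image or the screen region.
Vec2 toPixels(const Vec2& normalised, int width, int height)
{
    return Vec2{ normalised.x * width, normalised.y * height };
}
```

The same brush position can thus be used twice per frame: once with the image dimensions to sample a colour, and once with the screen dimensions to place the brush.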

Re-painting the scene

When re-painting the filtered scene, we have to make sure that the brushes face the viewer (i.e. the normal of the brush-area is anti-parallel to the view-vector). In other applications this is called billboarding and can be done in several ways. If the current view-point was obtained by using the position and orientation of a virtual viewer, we can simply undo (reverse) the transformations responsible for the view-point when drawing the brushes. A second, more general approach is to save the current matrix-stack, reset it to the default rendering-context, render the brushes and restore the matrix-stack to its last state. To remain independent of the virtual viewer, we chose the second method for our implementation. This method is outlined in Listing 2.

// allocate space for the matrices (16 floats each, column-major order)
GLfloat modelViewMatrix[16], projectionViewMatrix[16], tempMatrix1[16], tempMatrix2[16];
glGetFloatv(GL_PROJECTION_MATRIX, projectionViewMatrix);	// get projection matrix
glGetFloatv(GL_MODELVIEW_MATRIX, modelViewMatrix);	// get modelview matrix
glMatrixMode(GL_MODELVIEW);	// all stack operations below act on the modelview stack
glPushMatrix();		// save original matrix stack (as A)
	// load the projection matrix and multiply with modelview to obtain the final matrix M
	glLoadMatrixf(projectionViewMatrix);
	glMultMatrixf(modelViewMatrix);
	// save this state (as B)
	glPushMatrix();
		glMultMatrixf(bbMatrix1); // multiply M with the first 4 BB vertices
		glGetFloatv(GL_MODELVIEW_MATRIX, tempMatrix1); // read the 4 results into tempMatrix1
	glPopMatrix();			// return to state B
	glMultMatrixf(bbMatrix2);	// multiply M with the second set of 4 BB vertices
	glGetFloatv(GL_MODELVIEW_MATRIX, tempMatrix2);	// read the 4 results into tempMatrix2
glPopMatrix();		// restore to original state A
// the results are in clip space: divide by w and apply the viewport to obtain pixels

Listing 1 - Using OpenGL to perform parallel vector multiplication
(on T&L hardware). The vectors have been stored in bbMatrix1 and bbMatrix2
respectively.

 

glMatrixMode(GL_PROJECTION);
glPushMatrix();			// save projection matrix
	glLoadIdentity();	// reset it to the identity
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();		// save modelview matrix
		glLoadIdentity(); // reset it to the identity
		//----------------------
		//-- Do the Rendering --
		// Use 0 for z-Co-ordinate (e.g. glVertex2f())
		//----------------------
	glPopMatrix(); // restore modelview matrix
glMatrixMode(GL_PROJECTION);
glPopMatrix(); // restore projection matrix
glMatrixMode(GL_MODELVIEW);

Listing 2 - Billboarding - Rendering objects which always face
the viewer.

Note: The code given here may not be ideal or straightforward, but it has proven reliable and generic. If you have suggestions or comments for improvement, please don't hesitate to contact the authors.