Crowne Plaza Hotel, Seattle, USA
We describe a novel system, "3DSketch", which demonstrates a two-handed 3D sketching paradigm for 3D modeling by casually digitizing an existing object. The conceptual model of the interface is based on the everyday experience of sketching with a pen on a piece of paper, but in our system the user holds a 3D digitizing stylus as a 3D pen to sketch in 3D space. We call the pen a "smart pen". In the Prototyper module, the user sketches just a few strokes on the object and immediately sees a 3D prototype made of clay lumps. In the Refiner module, as the user adds more random strokes over the object, the prototype surface automatically adapts to follow the pen, and the surface features (edges and corners) align with the user-specified ones, as if the pen tip applied a magnetic attractive force to the prototype. In the autoTracer module, the user sketches over smooth regions, and the system performs intelligent reasoning to infer smooth surfaces and to extract discontinuity edges and corners from the user's inaccurate and fragmented strokes. The internal surface representation is triangular splines (TriBezier, TriB, TriNURBS), whose support for arbitrary triangulation and local subdivision makes them flexible enough to model general surfaces.
3D Authoring Tools, Hand Gestures, Multimodal Input, Human-Computer Interface
To tackle the above problems, we use a 3D input device for direct 3D input and manipulation; for output, we currently use a projective display with many depth cues such as a floor mesh, shading, and occlusion. A stereo display device is preferred and will be used later. To make modeling easier for a common user with no artistic or computer skills, we base our system on a conceptual model of freehand sketching. Our system is novel in that the user only needs to draw unstructured strokes; the computer programs infer structured representations automatically, and the 3D modeling work thus becomes casual and easy.
Our "3DSketch" system can be used for 3D design, digitizing, and data visualization, but we currently concentrate on 3D modeling by digitizing a real object. The system set-up is shown in Fig. 1: an SGI Indy workstation is used for computation and display, a mouse is used by the left hand for menu/icon clicking and 3D rotation, a 3D pen is held in the right hand, and a 2-button foot pedal is used to pause, continue, or stop the data sampling procedure.
Figure 1. System set-up.
The software has three major modules: Prototyper, Refiner, and autoTracer. With the Prototyper module, the user quickly builds a "clay" prototype by sketching a few strokes. With the Refiner module, the user locally improves the prototype surface and aligns edge/corner features by adding more strokes. The autoTracer module automatically finds creases and corners from the stroke data over smooth regions, so that the user can avoid the tedious tracing of such features in the Refiner module. The automatically traced features are then fed into the Refiner module to improve the model. Fig. 2 shows an example.
Figure 2. Three modules of the system
From the programmer's view, the clay prototype modeling is implemented as iso-surfacing of a 3D scalar field. The local refinement is a procedure of deformable surface fitting and active edge alignment, using energy minimization in potential fields. The automatic tracing of discontinuity edges and corners is based on directional diffusion and topology analysis of a 3D tensor field.
We implemented the user interface using Xt/Motif and OpenGL. We do not use the 3D pen as a menu/icon selection device, since that would require extra motion of the right hand, which does not always rest on the desk. The left hand always rests on the desk, so hand fatigue is not a problem, and the menu/icon selections are allocated to the left hand. We also want to make use of the Motif utilities as much as possible, instead of writing very low-level interactions from scratch using the 3D pen. However, we have made efforts to reduce the left-hand operations by introducing gesture recognition into the system, so that some menu/icon selections become unnecessary.
A "virtual" pen is also displayed on the screen to mimic the actual motion of the 3D pen. The user can look at the pen to check the orientation of the coordinate system and the display mode, or to test lighting and rendering parameters on the pen for fast feedback. Pressing the left button of the foot pedal begins the data digitizing; releasing it pauses the digitizing. Pushing the right button finishes a session of digitizing and returns to the viewing and modeling mode. If the pen stays static or moves very little, the data recording program does not update the data input, so the user can pause during sketching. A screen shot of the interface is attached at the end of the paper.
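The idle-pen behavior can be sketched as a simple distance filter. The function name `record_stroke` and the threshold value are our own illustration, not the system's actual implementation:

```python
import math

# Hypothetical sketch of the stylus sampling filter: a new pen position
# is recorded only if it has moved more than a small threshold since the
# last recorded sample, so a paused pen does not flood the stroke with
# duplicate readings.
def record_stroke(samples, min_move=0.5):
    """Filter raw stylus samples, dropping near-static readings."""
    recorded = []
    for p in samples:
        if not recorded:
            recorded.append(p)          # always keep the first sample
            continue
        if math.dist(p, recorded[-1]) > min_move:   # Euclidean distance in 3D
            recorded.append(p)
    return recorded
```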
Based on this philosophy, we started to develop the 3DSketch system using a 3D digitizing pen. We call the system "3DSketch" since the strokes are sparse, giving only a "sketchy" description of the object. We demonstrate how to reduce the user's work by making the 3D pen more "intelligent". The user feels as if he/she had a "smart pen" that can understand his/her intent from a quick sketch and show a regularized model, with degeneracies automatically detected and marked.
Since current computer modeling tools require precise structured operations, the tools block the user's flow of ideas and interfere with the user's concentration on creativity. Artists and designers still prefer to make freehand sketches on paper with a pen for quick feedback in visual thinking. Some researchers have tried to convert such hand sketches into computer models. Lipson and Shpitalni [CAD1] implemented a system to reconstruct 3D models from freehand sketches. The problem with such systems is that the 3D models are inferred from a finished 2D sketch, so much of the intermediate information, such as the orientation of a curve, the sequence between curves, the grouping of strokes into a part, and the outer boundary of an object, is already lost. The inference system must perform segmentation, recognition, reconstruction, and representation tasks to interpret the entire finished sketch, and these tasks are often coupled together, making the problem very hard to solve. To avoid such problems, an interactive system is preferable: it performs incremental inference while the sketching is being done, so the intermediate information can be utilized. In our system, intermediate information is used and timely feedback is provided.
In other related work, 3D devices are also used to replace 2D mice. Sachs et al. [4] described a "3-draw" system, in which the user holds a mirror-like plate in the left hand to rotate the whole screen display and a 3D stylus in the right hand to sketch curves in 3D space. Deering [5] implemented a virtual environment called "holosketch". Despite the existence of so many 3D modeling software packages, we find that many producers still prefer to start with clay models, because clay is malleable and gives artists full flexibility to manipulate surfaces in ways that are not easy or not available in off-the-shelf modeling software. Many methods and steps are involved in translating a clay model into a computer model; our goal is to make the translation more efficient. Our system is special in that it allows random scratching over a region, so rigorous curve tracing or structured patch meshing is not required. Such random scratching is much easier and faster than conventional curve-based design schemes, in which the user cannot pause while sketching a curve and must follow a restrictive sequence to specify a surface mesh, mostly a rectangular mesh that produces artifacts when modeling irregular shapes and corners.
Many modeling activities, such as sketching and sculpting, involve both hands. According to Guiard's psychophysical research [3], the left and right hands often act as elements in a kinematic chain. For right-handed people, the left hand acts as the base link of the chain: the right hand's motions are based on this link, and the right hand finds its spatial references in the results of the left hand's motion. The two hands also operate at asymmetric temporal-spatial scales: the right hand performs high-frequency fine motion, the left hand low-frequency coarse motion. Our 3DSketch system uses this natural division of manual labor by assigning the low-frequency coarse setting of spatial context to the left hand, and the high-frequency fine selection and manipulation operations to the right hand. The left hand provides context by doing menu/icon selection, rotating the global scene, and setting the constraint mode. The right hand operates in the frame of reference set up by the left hand, after the left hand has selected a command and picked a tool; it performs the finer sampling, sketching, and manipulation of the surface geometry.
Although the 3D pen can also be used for designing surfaces from imagination or fitting surfaces to real data such as CT/MRI medical data, we are now concentrating on modeling by digitizing an existing object. Some subtle differences exist between these applications. For example, a sphere can be designed by clicking at the center and then stretching an initial sphere to a desired radius, but in digitizing, the probe can never move inside a solid object to specify a center. For digitizing, one display view is enough, but for modeling and visualization of real data, multiple display views may be necessary for 3D alignment.
To digitize curved, especially organic, shapes, current systems require the user to trace very dense and regular meshes on the surface. Such work is tedious and time-consuming (8 hours are reportedly needed for an experienced modeler to complete a surface mesh by manually digitizing a toy cat). Our system allows the user to casually specify the most important features and lets the computer fill in the gaps.
Textbooks on sketching teach students to first box up the objects to be drawn, outline dimensions and proportions, and then refine the details. The simplest example is sketching an ellipse. First draw a horizontal axis and a vertical axis, then a rectangular box, see Fig. 3(a); small arc segments are drawn at the top, bottom, left, and right to provide tangent constraints. From this incomplete sketch, we can already perceive a good ellipse (our eyes or brains fill in the gaps based on perceptual experience). To finish the sketch, more strokes are filled in between these key points and segments, and some local modifications may also be made, see (b).
These techniques provide guidelines for our system design. In our system, the user first makes a very crude prototype (a blob, a plate, a stick, etc.), then adds more strokes to specify surface details, edges, and corners; the crude surface automatically deforms to reach the sketch strokes, and the surface edges and corners move to align with the specified ones. Since it is difficult to trace the pen along ridge edges of objects (the pen often slips off the object), we provide an automatic inference module that can infer edges and corners from the user's sketch strokes over the nearby smooth regions, freeing the user from the edge-tracing task. The automatic inference module is based on our group's previous research on perceptual grouping [18, Vis96].
In the next section, the principles and implementation details are given. The work is summarized in Section 5, and future directions are discussed in Section 6.
An object is either a simple entity or an assembly of several parts; a part is approximately a blob, a stick, or a plate, and each part can be globally deformed to produce a more general part. To digitize a human head, a single sphere prototype can be produced by clicking just 4 points on the head; the sphere can then be globally tapered, and later locally deformed in the refinement module. To digitize a hand, we can specify the palm, the thumb, and the four fingers separately, then glue them together to give a single triangular mesh for later local refinement. A finger may also be prototyped as a cylinder, as a few shorter jointed cylinders, or as a few glued blobs. Compared with the surface patch method, volumetric clay modeling is more efficient: there are no gaps or holes in the parts or objects, and no Boolean operations are required in the composition.
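The gap-free gluing can be illustrated with a toy implicit-field sketch. The paper only states that prototyping is iso-surfacing of a 3D scalar field; the Gaussian falloff and all names below are our own assumptions:

```python
import math

# Toy sketch of the volumetric "clay" idea: each blob part contributes a
# scalar field that decays with distance from its center, contributions
# are summed, and the model surface is the iso-surface field(p) == T.
# Gluing parts is just adding their fields -- no Boolean operations.
def blob_field(p, blobs):
    """Scalar field value at point p for a set of (center, radius) blobs."""
    total = 0.0
    for center, radius in blobs:
        d2 = sum((a - b) ** 2 for a, b in zip(p, center))
        total += math.exp(-d2 / (radius * radius))   # Gaussian falloff
    return total

def inside(p, blobs, threshold=0.5):
    """A point lies inside the clay model if the field exceeds the iso-value."""
    return blob_field(p, blobs) > threshold
```

Note that the midpoint between two glued blobs stays inside the model, which is the "no gaps or holes" property claimed above.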
Gesture recognition is used for fast modeling without clicking the menus or icons. The recognition rules are as follows: if more than three strokes are sketched, an ellipsoid is generated; if only two strokes, the closed (or almost closed) stroke is taken as the cross-section and classified into a square or an ellipse in 3D space, and the other, open stroke is used as the sweeping axis for the cross-section. If both strokes are closed, a negative part is recognized, i.e., a hole is produced. Of course, the user can always override the recognized properties by manually clicking the icons; for example, the user can sketch a closed curve as the cross-section of a hole, sweep along the hole to specify an axis, and then change the default "positive" property to "negative". Fig. 4 shows a prototyping example.
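The stated rules can be sketched as a small classifier. The function names and the closedness tolerance are our own illustration, not the system's code:

```python
# Sketch of the stated gesture rules: more than three strokes -> ellipsoid;
# two strokes, one closed and one open -> sweep the closed cross-section
# along the open axis; two closed strokes -> a negative part (a hole).
def is_closed(stroke, tol=0.1):
    """A stroke is (almost) closed if its endpoints nearly coincide."""
    first, last = stroke[0], stroke[-1]
    return sum((a - b) ** 2 for a, b in zip(first, last)) ** 0.5 < tol

def recognize_gesture(strokes):
    if len(strokes) > 3:
        return "ellipsoid"
    if len(strokes) == 2:
        closed = [is_closed(s) for s in strokes]
        if all(closed):
            return "hole"          # negative part
        if any(closed):
            return "swept part"    # closed cross-section + open sweep axis
    return "unrecognized"
```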
Figure 4. Clay prototyping by hand gestures
To further reduce the menu/icon clicking, we plan to imple ment more gestures for delete, undo, redo, etc., but the menu/ icon options are still available.
The energy to be minimized is defined as follows:
E = E_smooth + E_surface + E_edge + E_corner
At first, the surface is a quadratic Bezier spline surface which is at least C0 because adjacent triangles share control points, but we still need the smoothness energy to minimize the mesh roughness. The smoothness energy is defined in terms of the first-order derivatives [Vis96]. The surface energy is for surface fitting. Minimizing the crease/corner energy aligns the creases/corners and significantly reduces the total energy, i.e., it cooperatively improves the surface fitting precision; we thus do not have to subdivide the triangles into many tiny ones to obtain a good fit when misaligned edges and corners are present. After the TriBezier surface is refined, creases and corners are marked and the model is upgraded to a C1 triangular B-spline (TriB) surface with preserved edges/corners (TriBezier is a special case of TriB); one more pass of energy minimization is then performed to obtain the final surface [Vis96].
We use the Levenberg-Marquardt algorithm with numerically estimated gradients of the energy and take the steepest descent direction in iterations (in physics, the gradient of a potential field is a force field). To maintain interactive speed, the active triangles should be kept as few as possible, so the user must move the pen locally, instead of traversing a long stroke over the object. After a region is refined, the pen can be moved to another local region.
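The optimization loop can be sketched as follows. A toy quadratic energy stands in for the spline energy here, and plain steepest descent replaces the actual Levenberg-Marquardt update; only the structure (numerical gradient, descent step) mirrors the description above:

```python
# Minimal sketch of the described loop: the energy gradient is estimated
# by central finite differences, and the parameters move a small step in
# the steepest-descent direction (the negative gradient is the "force").
def numeric_gradient(energy, x, h=1e-5):
    grad = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        grad.append((energy(xp) - energy(xm)) / (2 * h))  # central difference
    return grad

def steepest_descent(energy, x0, step=0.1, iters=200):
    x = list(x0)
    for _ in range(iters):
        g = numeric_gradient(energy, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]  # move against the gradient
    return x

# Toy energy with its minimum at (1, 2), standing in for E above:
energy = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2
```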
We have recently found that vector potential fields provide better convergence than scalar potential fields, at the cost of more storage and computation. The reason is that a position may receive conflicting forces from nearby data points. Using a vector potential field with radial directions for each data point cancels out the conflicting forces in the accumulated potential field, making it possible to attract the surface toward the correct directions more efficiently.
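The cancellation effect can be seen in a 1-D toy example of our own construction: two data points on opposite sides of a surface point pull in opposite directions, and only the vector accumulation lets the pulls cancel:

```python
# Toy 1-D illustration (our own, not from the paper): scalar accumulation
# adds force magnitudes and reports a large conflicting pull, whereas
# vector (signed) accumulation lets opposite pulls cancel.
def scalar_accumulate(point, data):
    return sum(abs(d - point) for d in data)   # magnitudes always add up

def vector_accumulate(point, data):
    return sum(d - point for d in data)        # signed pulls can cancel

data = [-1.0, 1.0]   # samples on either side of the surface point at 0
```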
To be as general as possible, we use only the sampled unstructured points and diffuse them into a stream surface. This is equivalent to diffusing rain droplets on a car's hood into a shallow water stream surface and thus obtaining a surface model of the hood. Since we are using a 3D "pen", we call each data point an inkspot (similar to "streamball" in [13]). To simplify the principles, we first explain the diffusion procedure in 2D, that is, diffusing inkspots on a sheet of paper into streamlines and detecting the corners as well. If the user sketches slowly, the streamlines are readily available by simply connecting sequential inkspots and fitting them with a B-spline curve; then we can skip the inkspot diffusion step and directly diffuse the streamlines into stream surfaces. Details on the algorithms can be found in [18, Vis96].
Fig. 7 (a) shows some inkspots with added noise for testing. In (b), an inkspot is shown. The mass density is largest and the grayscale darkest (black) at the inkspot position; as an inkspot diffuses, its mass density decays with distance and the ink darkness decays as well. For a position on the paper, since the ink reaches the position from a specific direction, physical measures such as mass density, velocity, and kinetic energy density are all vectors. For each position, all nearby inkspots can diffuse and reach this position with different mass densities along different directions. That means each position accumulates many mass density vectors, as depicted in (c); thus each position contains a tensor, and the paper becomes a dense 2D tensor field. This is similar to the stress tensor field inside a solid, where each position receives stress forces of different strengths and directions from all nearby positions [15,16,17].
Mathematically, we compute at each position a 2x2 covariance matrix (also called second-order moments, or a scatter matrix) of all accumulated vectors; its two eigenvectors define the major and minor axes of an ellipse, as seen in (c). A large thin ellipse indicates an ink stroke passing through this position, and the major eigenvector gives the tangent direction of the stroke; by contrast, a large round ellipse has two salient directions, indicating an intersection of two or more strokes. More specifically, the major eigenvalue yields an absolute measure of ink mass density, the ratio of the eigenvalues (eccentricity) tells whether this position is a stroke curve position or a corner/intersection position, and small eigenvalues indicate positions far from the inkspots.
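This 2-D tensor analysis can be sketched directly; the classification thresholds below are our own choices, and only the scatter-matrix/eigenvalue structure comes from the description above:

```python
import math

# Sketch of the 2-D tensor analysis: accumulate the outer products of the
# incoming mass-density vectors into a 2x2 scatter matrix, take its
# eigenvalues, and classify the position from the eigenvalue ratio.
def scatter_matrix(vectors):
    sxx = sum(v[0] * v[0] for v in vectors)
    sxy = sum(v[0] * v[1] for v in vectors)
    syy = sum(v[1] * v[1] for v in vectors)
    return sxx, sxy, syy

def eigenvalues(sxx, sxy, syy):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, major first."""
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 + disc, tr / 2 - disc

def classify(vectors, strong=1.0, round_ratio=0.5):
    lam1, lam2 = eigenvalues(*scatter_matrix(vectors))
    if lam1 < strong:
        return "empty"      # small eigenvalues: far from all inkspots
    if lam2 / lam1 > round_ratio:
        return "corner"     # large round ellipse: two salient directions
    return "stroke"         # large thin ellipse: one tangent direction
```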
After the above diffusion, we obtain an estimated stroke direction (curve tangent) at each position, and we can further improve the result with a second pass of diffusion, this time oriented: the mass density decays with distance and also with the offset angle from the tangent. If the tangent direction is very certain (the ellipse is very thin), the linear diffusion pattern in Fig. 8(a) is used. If the tangent direction is not very certain (a round ellipse), the stroke at this position is either very curved or at a corner/intersection, so the curved diffusion pattern in Fig. 8(b) is used.
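A possible weight for the oriented second pass is the product of a radial and an angular falloff. The Gaussian forms and widths below are our assumptions; the paper only states that mass density decays with distance and with the offset angle from the tangent:

```python
import math

# Hypothetical oriented diffusion weight: the contribution of an inkspot
# decays with distance from the source and with angular offset from the
# locally estimated tangent, so mass spreads along the stroke, not across it.
def oriented_weight(distance, angle_offset, sigma_d=1.0, sigma_a=0.3):
    radial = math.exp(-(distance * distance) / (2 * sigma_d ** 2))
    angular = math.exp(-(angle_offset * angle_offset) / (2 * sigma_a ** 2))
    return radial * angular
```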
Fig. 9 (a) is the mass density after the second pass diffusion, and (b) is the stroke curves traced by searching for locally darkest positions. The intersection positions (degenerate points) are also marked.
In 3D space, inkspots or ink streamlines diffuse to yield a 3D tensor field. At each position the tensor is represented by three eigenvectors which depict an ellipsoid. If the ellipsoid is stick-like, the position belongs to a surface, since there exists a unique major normal direction; if it is plate-like, the position lies along an edge, since there are two major normal directions; if it is blob-like, the position is a corner, due to the three conflicting stream surface normal directions. See Fig. 10 for an intuitive illustration.
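The ellipsoid classification can be sketched with the usual eigenvalue saliencies; the saliency definitions below are a common tensor-voting convention and our own choice, not quoted from the paper:

```python
# Sketch of the 3-D ellipsoid classification: given sorted eigenvalues
# l1 >= l2 >= l3 of the accumulated tensor, the saliencies l1-l2 (stick),
# l2-l3 (plate), and l3 (blob) pick out the dominant shape.
def classify_tensor(l1, l2, l3):
    stick, plate, blob = l1 - l2, l2 - l3, l3
    best = max(stick, plate, blob)
    if best == stick:
        return "surface"   # stick-like: one dominant normal direction
    if best == plate:
        return "edge"      # plate-like: two dominant normal directions
    return "corner"        # blob-like: three conflicting normal directions
```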
Fig. 11 (a) shows 3D strokes sketched over the smooth regions of a wooden part. Note that we do not trace any edges or corners. The AutoTracer module infers potential fields for surfaces, edges, and corners. A sphere is given by the Prototyper module, and the Refiner module is called to deform the C0 TriBezier prototype and align the edges and corners. Finally, the edges and corners are automatically marked and the TriBezier surface is upgraded to a TriB surface which is C1 everywhere except along the edges and at the corners; the Refiner module is then called once more for the TriB surface [Vis96].
Figure 11. AutoTracer on a piece of wood part
Technically, the prototyping module uses equipotential surfaces of a scalar field to produce clay-like models; the local refinement module uses scalar or vector potential fields to generate attractive forces that deform the shape; and the automatic inference module uses tensor fields to find surfaces and features (creases and corners).
For user interaction, we now use a 2D mouse for menu selection and 3D rotation. We have recently ordered a 6-degree-of-freedom space ball, which will make the left-hand operations easier. One company has attached a laser beam and sensor at the pen tip, so that the object is not damaged by the touch of the sharp pen tip during sketching. More hand gesture recognition techniques will be applied to further reduce the time spent looking at the screen for menus and icons. With a see-through head-mounted display, the user could directly see the 3D model superimposed on the real object and find the regions where more refining strokes are needed; a translucent display that does not block the user's view may be helpful.
[10] M. Eck and H. Hoppe, Automatic Reconstruction of B-Spline Surfaces of Arbitrary Topological Types, SIGGRAPH'96, pp. 325-334, Aug. 1996.
[11] V. Krishnamurthy and M. Levoy, Fitting Smooth Surfaces to Dense Polygon Meshes, SIGGRAPH'96, pp. 313-324, Aug. 1996.
[12] L. Forssel, Visualizing Flow Over Curvilinear Grid Surfaces Using Line Integral Convolution, IEEE Vis'94, pp. 240-247, Washington D.C., Oct. 1994.
[13] M. Brill, H. Hagen, H.-C. Rodrian, W. Djatschin, and S. Klimenko, Streamball Techniques for Flow Visualization, IEEE Vis'94, pp. 225-231, Oct. 1994.
[14] A. Parkin, Some Problem of Singular Points and Boundary-conditions in the Sketching of Flow Nets, Geotechnique, vol. 44, no. 3, pp. 513-518, Sep. 1994.
[15] T. Delmarcelle and L. Hesselink, Visualizing Second-Order Tensor Fields with Hyperstreamlines, IEEE CG&A, vol. 13, no. 4, pp. 25-33, July 1993.
[16] T. Delmarcelle and L. Hesselink, The Topology of Symmetric Second-order Tensor Fields, IEEE Vis'94, pp. 140-147, Washington, D.C., Oct. 17-21, 1994.
[17] L. Hesselink, Y. Levy, and Y. Lavin, The Topology of Symmetric, Second-Order 3D Tensor Fields, IEEE Trans. Vis&CG, vol. 3, no. 1, pp. 1-11, Jan.-Mar. 1997.
[18] G. Guy and G. Medioni, Inferring Global Perceptual Contours from Local Features, Int'l Journal of Computer Vision, vol. 20, no. 1/2, pp. 113-133, 1996.