MLSGPU is a tool for reconstructing triangle meshes from point clouds obtained via laser range scanning (or potentially other methods). It is able to take advantage of GPUs for high performance, and can handle hundreds of gigabytes of input and output.
MLSGPU is only one step in a scanning pipeline. Acquisition, cleaning, registration and feature size estimation all need to occur before MLSGPU is used. Refer to Section 3.1, “Input files” and Section 3.2, “Output files” for further details on the input and output data formats.
MLSGPU requires either Microsoft Windows, or a POSIX operating system such as a GNU/Linux system. At present only Ubuntu 12.04 is tested, but other variants are expected to work. It is also highly recommended that you use a 64-bit operating system. It should still be possible to use a 32-bit OS, but it is untested and there may be problems with large data sets.
MLSGPU depends on the following software to compile and run. Versions listed are the ones that have been tested; older or newer versions will often work too.
A C++ compiler. GCC 4.6 and 4.7, Clang 3.0 and MSVC 2010 have been tested. Note that at the time of writing, Clang does not support OpenMP and so performance will be reduced.
Boost 1.48, including the following runtime libraries:
boost_program_options
boost_iostreams
boost_thread
boost_filesystem
boost_system
boost_math_c99
boost_math_c99f
boost_serialization
clogs 1.1
Doxygen 1.7.4
Python 2.7
xsltproc 1.1
DocBook 4.3 stylesheets
An implementation of OpenCL 1.1. GPU device drivers will normally include this. It has been tested with NVIDIA GPU drivers and with the AMD APP SDK 2.7 on a CPU. The device must support images.
The following tools and libraries are necessary to build optional parts, but are not required:
CppUnit 1.12 is needed to build the test suite.
The following list of packages should suffice on Ubuntu 12.04 (although it has not been tested against a clean installation), with the exception of clogs, which has not been packaged for Ubuntu. When configuring clogs you can pass --cl-headers=MLSGPU_ROOT/khronos_headers to get the OpenCL header files.
xsltproc docbook-xml docbook-xsl libboost-dev libboost-iostreams-dev libboost-filesystem-dev libboost-system-dev libboost-math-dev libboost-program-options-dev libboost-thread-dev libcppunit-dev g++ libgl1-mesa-dev
Before actually compiling, the build must be configured. This can be done by running python waf configure. This will check that the required libraries are present. If configuration fails, you can find more detailed error information in build/config.log. The build system will attempt to auto-detect the compiler, but if you wish to override it you can set the CXX environment variable before doing the configuration.
The installation directories are chosen at configure time. The default is to install files into subdirectories of /usr/local, but this can be overridden with --prefix=PREFIX.
There are also other command-line options that can be given to affect the configured build. They are intended mainly for developer rather than end-user use, so they are not documented here. Running python waf configure --help will show a full list.
Once configuration is complete, running python waf will perform the compilation.
The input format for MLSGPU is the PLY file format. Additionally, it is restricted to a subset of valid PLY files:
Only binary files are supported, and only in the endianness used by the host CPU (typically little-endian for an x86 or x86-64 CPU).
The first type of element in the file must be vertex. Other elements may be present but they must occur later in the file, and will be ignored.
The vertex element must contain the fields x, y, z, nx, ny, nz and radius (explained below), and they must all have type float32. Other fields may be present as long as they are not lists, and they will be ignored.
The positions are given by x, y and z. The units are arbitrary, but they must of course match across all input files. The oriented normals are given by nx, ny and nz, and they must have unit length. The final required field is radius, which is an estimate of the spacing between the sample and its neighbors. This must be positive and use the same units as the position.
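The field requirements above can be sketched in Python as a small writer for a compatible file. This is an illustration only and is not part of MLSGPU: the function name write_mlsgpu_ply and the tolerance used for the unit-length check are assumptions, and the header uses the float32 type keyword as named in the text.

```python
import struct
import sys


def write_mlsgpu_ply(path, samples):
    """Write samples as a binary PLY file in the host's byte order.

    Each sample is a tuple (x, y, z, nx, ny, nz, radius).
    The function name and the validation tolerance are hypothetical.
    """
    if sys.byteorder == "little":
        fmt = "binary_little_endian"
    else:
        fmt = "binary_big_endian"
    fields = ("x", "y", "z", "nx", "ny", "nz", "radius")
    lines = ["ply", "format %s 1.0" % fmt, "element vertex %d" % len(samples)]
    lines += ["property float32 %s" % name for name in fields]
    lines.append("end_header")
    header = "\n".join(lines) + "\n"

    with open(path, "wb") as f:
        f.write(header.encode("ascii"))
        for s in samples:
            x, y, z, nx, ny, nz, radius = s
            if radius <= 0:
                raise ValueError("radius must be positive")
            if abs(nx * nx + ny * ny + nz * nz - 1.0) > 1e-3:
                raise ValueError("normal must have unit length")
            # "=" selects native byte order with standard 4-byte floats.
            f.write(struct.pack("=7f", x, y, z, nx, ny, nz, radius))
```

The vertex element is written first and contains only the seven required fields, so no list properties or other elements need to be skipped by the reader.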
For best performance, the order of input samples in a file should correlate well with position. Simply outputting the points as they are encountered in a regular sampling grid will give good results. In particular, do not sort the points along a single axis, as this will reduce coherence.
MLSGPU accepts multiple input files. The files must already have been registered and transformed into a common coordinate system.
The output format for MLSGPU is again the PLY file format. The output file will contain just vertex positions and triangles; all other metadata from the input is discarded. MLSGPU can either write the entire output mesh to a single PLY file, or break the volume up into a regular grid and output a separate PLY file for each non-empty grid cell. In the latter case, the vertices at the boundaries between files will be duplicated in both files, so that neighboring files can be loaded together to give a seamless join.
The minimum command line for running MLSGPU is
mlsgpu --fit-grid=spacing -o output.ply input.ply ...
The spacing argument specifies the distance between sample points in the regular grid that will be used in the Marching Tetrahedra algorithm. All vertices in the output file will be on edges of this grid. This value should be of a similar order of magnitude to the finest scanning density. Using too large a value will not only cause the reconstruction to look blocky, but will also lead to unexpected holes. Using too small a value will lead to an excessively large output file, and will also significantly increase the running time.
Multiple input files may be listed on the command line. You may also list a directory on the command line, in which case all .ply files in that directory will be loaded (but without recursing into subdirectories).
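When MLSGPU is driven from a script, the minimum command line above can be assembled as an argument list. The following is a minimal sketch: the helper name mlsgpu_command is hypothetical, and it uses only the --fit-grid and -o options described in this section.

```python
def mlsgpu_command(spacing, output, inputs):
    """Build the minimum mlsgpu argument list (hypothetical helper).

    inputs may mix .ply files and directories, as described above.
    """
    if spacing <= 0:
        raise ValueError("grid spacing must be positive")
    if not inputs:
        raise ValueError("at least one input file or directory is required")
    return ["mlsgpu", "--fit-grid=%s" % spacing, "-o", output] + list(inputs)


# Example: subprocess.check_call(mlsgpu_command(0.002, "out.ply", ["scans"]))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual filenames.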
The following subsections document the options that are intended for general use. There are additional options that are only intended for use by the developers of MLSGPU, and which are not documented. You can see a full list of options by running mlsgpu --help, which also shows the default values used.
To handle large datasets, the output mesh is first written to temporary files before being reorganised for the final output files. The temporary files will take roughly the same amount of space (sometimes around 20% more) as the final output files, so you will need to ensure you have sufficient free space. Use --tmp-dir path to store the temporary files in path. If this option is not specified, the default path for the operating system is used.
The temporary files are deleted at the end of a successful run, but if the program crashes or is killed, the temporary files will remain on disk and need to be manually removed to recover the space.
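The space requirement described above can be checked before a long run. This sketch assumes you already have a rough estimate of the final output size; the function names are hypothetical, and the 20% margin is taken from the estimate in the text rather than from any guarantee.

```python
import shutil


def required_temp_bytes(expected_output_bytes, overhead_percent=20):
    """Estimate temporary space: the output size plus a safety margin."""
    return expected_output_bytes * (100 + overhead_percent) // 100


def has_enough_temp_space(tmp_dir, expected_output_bytes):
    """Return True if tmp_dir has room for the estimated temporary files."""
    free = shutil.disk_usage(tmp_dir).free
    return free >= required_temp_bytes(expected_output_bytes)
```

Note that if the temporary directory and the output directory are on the same filesystem, both the temporary files and the final output must fit at the same time.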
Operating systems sometimes place a limit on the length of a command line, which can be a problem if there are a very large number of input files (although the option to specify a directory instead of a file is usually sufficient). To work around this, a response file can be used to place the command-line arguments in a file. First create a file with the command-line arguments. The arguments can be separated by whitespace or placed on separate lines. Then pass --response-file filename when running MLSGPU. It is possible to place some arguments in the response file and others on the command line, but only one response file is supported. The response-file processor is also extremely basic: spaces in filenames will cause problems, and shell wildcards will not work.
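A response file can be generated by a script once the list of inputs is known. The sketch below writes one argument per line and refuses arguments containing whitespace, since the text above warns that these cause problems; the function name write_response_file is hypothetical.

```python
def write_response_file(path, args):
    """Write command-line arguments to a response file, one per line.

    Arguments containing whitespace are rejected because the
    response-file processor cannot handle them. Wildcards are not
    expanded by MLSGPU, so expand them before calling this function.
    """
    for arg in args:
        if not arg or any(ch.isspace() for ch in arg):
            raise ValueError("argument unusable in a response file: %r" % (arg,))
    with open(path, "w") as f:
        for arg in args:
            f.write(arg + "\n")
```

The resulting file is then named as the argument of --response-file, with any remaining options given on the command line as usual.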
Rather than producing a single giant output file, it is possible to split the output into chunks by passing --split on the command line. The chunks form a regular grid and each chunk is named basename_XXXX_YYYY_ZZZZ.ply, where XXXX, YYYY and ZZZZ are the positions within this grid and basename is the argument to -o. Note that for this usage, the argument to -o should be just a prefix and not a full filename.
Only output files that contain at least one triangle are written. If you are experimenting with different parameters, it is strongly recommended that you delete all the outputs from previous runs with the same basename before starting. Otherwise, if a chunk file is not written in the current run, the old file of the same name will remain and be mixed in with the newly written files.
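Finding leftover chunk files from an earlier run can be automated. This sketch assumes that the grid positions in the file names consist only of decimal digits, which the text above suggests but does not state; the function name stale_chunk_files is hypothetical.

```python
import glob
import os
import re


def stale_chunk_files(basename):
    """List existing chunk files named basename_XXXX_YYYY_ZZZZ.ply.

    Assumes the three grid positions are runs of decimal digits.
    """
    name = os.path.basename(basename)
    pattern = re.compile(re.escape(name) + r"_\d+_\d+_\d+\.ply\Z")
    candidates = glob.glob(glob.escape(basename) + "_*_*_*.ply")
    return sorted(p for p in candidates if pattern.match(os.path.basename(p)))


# Example: remove old chunks before a new run with -o out/mesh
# for path in stale_chunk_files("out/mesh"):
#     os.remove(path)
```

The regular expression is stricter than the glob pattern, so unrelated files that merely share the prefix are left alone.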
The spatial size of the chunks is chosen automatically using heuristics that attempt to keep the size of each file manageable, but since it is impossible to determine the sizes of the output files in advance, the heuristic may need to be adjusted if the output files are too big or too small. This can be done by passing --split-size=size, where size is a target size. Use --help to see what the default value is and then adjust accordingly. You can use a suffix of K, M or G to specify kibibytes, mebibytes or gibibytes respectively.
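To make the suffixes concrete, the following sketch converts a size string into a byte count using binary multiples. It illustrates the units only and is not MLSGPU's own parser; treating a value without a suffix as a plain byte count is an assumption.

```python
# Binary multiples: kibibyte, mebibyte and gibibyte.
_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '100M' into bytes (illustrative only)."""
    text = text.strip()
    if text and text[-1] in _SUFFIXES:
        return int(text[:-1]) * _SUFFIXES[text[-1]]
    # Assumption: a bare number is a byte count.
    return int(text)
```

For example, a target of 100M corresponds to 104857600 bytes, slightly more than 100 million.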
By default, MLSGPU will run on all GPU devices it finds in the system. This is often the desired result, but in some cases it may be desirable to use extra devices or restrict the set of devices used. In particular, when there are no OpenCL-capable GPUs in the system, it will usually be necessary to pass --cl-cpu.
There are three command-line options that control device selection: --cl-cpu, --cl-gpu and --cl-device. The effects are additive, i.e., any device that matches any of the command-line selectors will be used. The --cl-cpu and --cl-gpu options take no arguments, and simply enable all CPU or GPU devices.
The --cl-device option can be used in two ways: firstly, --cl-device=prefix will enable all devices whose device name begins with prefix. The device name is determined by the OpenCL API; a tool like clinfo from the AMD APP SDK is useful to discover the names of the devices in the system. Secondly, --cl-device=prefix:n will enable just the nth device (zero-based) whose name starts with prefix. This is mainly useful if there are several identical devices in the system.
As an example, passing --cl-cpu --cl-device=Intel --cl-device=GeForce:0 will enable all CPU devices, all devices whose name begins with Intel and the first device whose name begins with GeForce.
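The additive selection rules can be modelled in a few lines of Python. This is a sketch of the behavior as described above, not MLSGPU's implementation: the function name, the (name, kind) device representation, and the handling of the prefix:n form by splitting at the last colon are all assumptions, as is the idea that any explicit selector replaces the all-GPUs default.

```python
def select_devices(devices, cl_cpu=False, cl_gpu=False, cl_device=()):
    """Return the sorted indices of the devices that would be enabled.

    devices is a list of (name, kind) pairs, with kind "cpu" or "gpu".
    """
    if not cl_cpu and not cl_gpu and not cl_device:
        # Default behavior: every GPU device in the system.
        return [i for i, (_, kind) in enumerate(devices) if kind == "gpu"]

    selected = set()
    for i, (_, kind) in enumerate(devices):
        if (cl_cpu and kind == "cpu") or (cl_gpu and kind == "gpu"):
            selected.add(i)
    for selector in cl_device:
        prefix, sep, index = selector.rpartition(":")
        if sep and index.isdigit():
            # prefix:n form: only the nth (zero-based) matching device.
            matches = [i for i, (name, _) in enumerate(devices)
                       if name.startswith(prefix)]
            if int(index) < len(matches):
                selected.add(matches[int(index)])
        else:
            # Plain prefix form: every device whose name starts with it.
            selected.update(i for i, (name, _) in enumerate(devices)
                            if name.startswith(selector))
    return sorted(selected)
```

With one Intel CPU device and two identical GeForce devices, the example options above would select the CPU and the first GeForce only.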
When mixing devices that are not identical, differences in floating-point computation can cause variations at the join between blocks. This can lead to cracks in the reconstructed mesh, and in extreme cases the mesh may even become non-manifold. For final production always use only identical devices.
When MLSGPU starts, it will report which devices it is using.
The MLS reconstruction is essentially a process to smooth the noisy sampling process. The degree of smoothing can be controlled with --fit-smooth. Increasing the smoothing value will reduce noise, but may also smooth out detail. As a side effect of the implementation, increasing smoothing will also allow small holes to be filled in that would not have been filled at lower smoothing levels. The running time scales roughly with the square of the smoothing factor, so using too much smoothing can also make MLSGPU very slow.
The underlying reconstruction algorithm tends to create spurious pieces of geometry that are disconnected from the rest of the model, so as a final step any small connected components are discarded. Usually this will just do the right thing, but if the scans actually capture some small feature that is disconnected from the rest of the scanned data, it may accidentally be discarded. In this case, the threshold for discarding a component (as a fraction of the total number of output vertices) may be specified with --fit-prune.
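The effect of the pruning threshold can be illustrated with a small model. This is a sketch under assumptions, not the actual implementation: it takes the total as the vertex count before pruning and keeps a component whose size is exactly at the cutoff, neither of which is specified above.

```python
def prune_components(component_sizes, threshold):
    """Return the sizes of the components that survive pruning.

    component_sizes holds the vertex count of each connected component;
    threshold is a fraction of the total vertex count.
    """
    total = sum(component_sizes)
    cutoff = threshold * total
    return [size for size in component_sizes if size >= cutoff]
```

For a mesh with components of 9700, 250 and 50 vertices, a threshold of 0.02 gives a cutoff of 200 vertices, so only the 50-vertex component is discarded. Lowering the threshold to 0.001 would keep all three.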
MLSGPU explicitly detects boundaries in the provided point cloud. It tries to avoid extrapolating beyond these boundaries, as these extrapolations tend to have very poor quality. However, the heuristic is not perfect, and tends both to cause unwanted small holes in the reconstruction and to extrapolate in some areas where it should not. The default tries to balance the two, but the user can override the threshold using --fit-boundary-limit. Increasing the value will cause more extrapolation, while decreasing it will reduce extrapolation but potentially open more holes. However, increasing the value beyond about 1.7 will have no further effect.
There are a number of limitations to the amount and type of input and output that MLSGPU can handle:
Only certain types of input files can be used. See Section 3.1, “Input files” for details.
Up to 2^23 (about 8 million) input files. Note that when using large numbers of input files, you will probably need to either pass a directory on the command line, or use response files to work around limits on the length of the command line.
Up to 2^40 (about 1.1 trillion) points per input file.
Up to 2^32-1 (about four billion) vertices per output file (this is a limitation of the PLY file format).
The total size of the model can be at most 2^20 (about one million) times the grid spacing. For example, a model with a side length of 1 kilometre cannot be reconstructed at finer than 1 mm.
Two runs of MLSGPU will generally not produce exactly the same stream of bytes, even with identical arguments. However, the only difference should be the order in which the vertices and triangles appear in the files, and the geometry should be identical.
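The grid-size limit in the list above can be checked with simple arithmetic before choosing a value for --fit-grid. In this sketch the function names are hypothetical, and the extent is assumed to be the longest side of the model's bounding box, in the same units as the grid spacing.

```python
# The model may span at most 2^20 grid cells along a side.
MAX_GRID_CELLS = 2 ** 20


def min_grid_spacing(extent):
    """Smallest grid spacing that still covers a model of this extent."""
    return extent / float(MAX_GRID_CELLS)


def spacing_is_feasible(extent, spacing):
    """Return True if the model fits within the grid-size limit."""
    return extent <= spacing * MAX_GRID_CELLS
```

For a model 1000 metres across, min_grid_spacing gives roughly 0.00095 metres, which matches the example of a 1 kilometre model being limited to about 1 mm.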
MLSGPU can be used on a cluster to distribute processing to more GPUs than will fit in a single box. It scales reasonably well to 8 GPUs, but beyond this point it is likely that the master node will become a bottleneck as some operations are not parallelized.
To use MLSGPU on a cluster, you will need an MPI implementation which supports MPI-IO. We have only tested with OpenMPI 1.6 on Linux, and in fact older versions of OpenMPI have known bugs. MPI is automatically detected when running python waf configure. The resulting binary is called mlsgpu-mpi, and the interface is essentially the same as for mlsgpu.
Most data movement is handled through the filesystem. It is thus beneficial to have a high performance parallel filesystem that integrates with MPI-IO. We have had good results with GPFS, but other filesystems will probably work fine too. NFS does not work very well, because it requires a lot of locking to guarantee the necessary semantics for safe parallel access. Note that the temporary directory must be on a filesystem that is shared between the nodes, not a local scratch area.
MLSGPU is designed to run with one process per node and to use multiple threads, rather than running one process per CPU core. If you are using OpenMPI, then you should pass -pernode to mpirun. MLSGPU will fire up a number of threads for managing I/O and GPUs, and more under the control of OpenMP (the number can be overridden by passing --omp-threads to mlsgpu-mpi). If you are using a scheduling system on the cluster it is best to ask to reserve entire nodes, but if not it is up to you to ensure that MLSGPU does not consume more CPU cores than you have reserved.
5.1. The configuration said that a header file was not found, but I know it exists.
The error indicates that compilation using that header file failed, but this can happen for other reasons than the header file being absent. Look through
5.2. Meshlab crashes when I try to open one of the output files.
Meshlab is unable to process long comments. Try deleting the comments from the output file. On a UNIX system you can do this by running
5.3. Every time I run the program I get different output files, even though I use the same options.
This is normal behavior. The geometry is (or should be) the same every time. Only the order of the vertices and triangles changes.
5.4. MLSGPU is using too much CPU memory.
Run
Check whether
5.5. I am getting errors about too much memory being used for an OpenCL device.
Firstly try reducing the value of
5.6. I get the error
This usually indicates that the value of
5.7. I get almost no output, or I get the message
This usually indicates that the value of
5.8. The output model contains lots of tiny holes in a regular pattern.
This is usually caused by the value of
5.9. There are some small holes in the output that I would like to fill.
Increasing
5.10. My scans consisted of several unconnected pieces and one of them does not appear in the output.
See Section 3.3.6, “Component pruning” for an explanation of the
MLSGPU is no longer being actively developed. If you find a bug or need a new feature, your best option is to fix or implement it yourself and send me a GitHub pull request.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.