- 1 Usage
- 2 M5 Options
- 3 Script Options
- 4 Full System (FS) Mode
- 5 Checkpoints
- 6 Switchover/Fastforwarding
- 7 Multiprogrammed Workloads
Usage
The M5 command line has four parts: the M5 binary, options for the binary, a simulation script, and options for the script. Options for the M5 binary and options for the script are handled separately, so make sure each option you use is passed to the right component.
% <m5 binary> [m5 options] <simulation script> [script options]
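The four-part layout can be sketched as a small Python helper that assembles the argument list in the right order. The function name `build_m5_command` and its parameters are hypothetical and not part of M5; the paths and flags in the usage lines are taken from examples elsewhere on this page.

```python
def build_m5_command(binary, script, m5_options=(), script_options=()):
    """Return an argv list: binary, m5 options, script, script options.

    Hypothetical helper for illustration; it is not part of M5.
    """
    # Options placed before the script path are parsed by the M5 binary;
    # everything after the script path is handed to the script itself.
    return [binary, *m5_options, script, *script_options]


cmd = build_m5_command(
    "build/ALPHA_SE/m5.debug",
    "configs/example/se.py",
    m5_options=["-d", "/tmp/output"],
    script_options=["--cmd=eon00"],
)
print(" ".join(cmd))
```

Keeping the two option lists separate makes it harder to hand a script option to the binary by accident, or the reverse.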
M5 Options
Running m5 with the "-h" flag prints a help message that includes all of the supported simulator options. Here's a snippet:
% build/ALPHA_SE/m5.debug -h
Usage
=====
  m5.debug [m5 options] script.py [script options]

Copyright (c) 2001-2008 The Regents of The University of Michigan
All Rights Reserved

Options
=======
--version               show program's version number and exit
--help, -h              show this help message and exit
--authors, -A           Show author information
--build-info, -B        Show build information
--copyright, -C         Show full copyright information
--readme, -R            Show the readme
--outdir=DIR, -d DIR    Set the output directory to DIR [Default: m5out]
--redirect-stdout, -r   Redirect stdout (& stderr, without -e) to file
--redirect-stderr, -e   Redirect stderr to file
--stdout-file=FILE      Filename for -r redirection [Default: simout]
--stderr-file=FILE      Filename for -e redirection [Default: simerr]
--interactive, -i       Invoke the interactive interpreter after running the script
--pdb                   Invoke the python debugger before running the script
--path=PATH[:PATH], -p PATH[:PATH]
                        Prepend PATH to the system path when invoking the script
--quiet, -q             Reduce verbosity
--verbose, -v           Increase verbosity
...
Script Options
The script section of the command line begins with the path to your script file and includes any options that you'd like to pass to that script. Most example scripts accept a '-h' or '--help' flag that lists the script-specific options. An example is as follows:
M5 compiled Apr 2 2011 00:57:11
M5 started Apr 3 2011 21:16:02
M5 executing on zooks
command line: build/ALPHA_SE/m5.opt configs/example/se.py -h
Usage: se.py [options]

Options:
  -h, --help            show this help message and exit
  -c CMD, --cmd=CMD     The binary to run in syscall emulation mode.
  -o OPTIONS, --options=OPTIONS
                        The options to pass to the binary, use " " around the entire string
  -i INPUT, --input=INPUT
                        Read stdin from a file.
  --output=OUTPUT       Redirect stdout to a file.
  --errout=ERROUT       Redirect stderr to a file.
  --ruby
  -d, --detailed
  -t, --timing
  --inorder
  -n NUM_CPUS, --num-cpus=NUM_CPUS
  --caches
  --l2cache
  --fastmem
  --clock=CLOCK
  --num-dirs=NUM_DIRS
  --num-l2caches=NUM_L2CACHES
  --num-l3caches=NUM_L3CACHES
  --l1d_size=L1D_SIZE
  --l1i_size=L1I_SIZE
  --l2_size=L2_SIZE
  --l3_size=L3_SIZE
  --l1d_assoc=L1D_ASSOC
  --l1i_assoc=L1I_ASSOC
  --l2_assoc=L2_ASSOC
  --l3_assoc=L3_ASSOC
  ...
To run SPEC 2000 binaries on m5 you should use the cpu2000.py configuration script.
Input sets and Binaries
We use several of the cpu2000 benchmarks for our regression tests. Unfortunately, because of licensing restrictions, we can't provide the binaries or input files. To make this a bit easier, we have created cpu2000.py. Currently the script is tailored to our particular organization of the binaries and input files. To make the script work for you, you'll minimally have to change spec_dist to point to wherever you keep your cpu2000 binaries and input sets. We have our binaries and input sets organized in the following directory structure:
Here ARCH is alpha or sparc, OPSYS is linux or tru64, BENCHMARK is the name of the SPEC binary (e.g. gzip), INPUTSET is the name of the input set (e.g. smred), and FILES are the specific input files. If you can't create this structure, you'll have to modify cpu2000.py to change how it finds files.
How to use it
The cpu2000.py configuration file takes this data and creates an m5 workload parameter based on the benchmark name, ISA, operating system, and input set. tests/long/00.gzip/test.py shows an example of this, but in brief:
from cpu2000 import gzip_log
workload = gzip_log('alpha', 'tru64', 'smred')
root.system.cpu.workload = workload.makeLiveProcess()
Assuming the rest of your machine is configured normally, the snippet above would correctly run the gzip log SPEC2000 benchmark for alpha/tru64 with the smred input set.
SPEC2K Command Lines (Syscall Emulation)
If you would like to run SPEC2K using syscall-emulation mode, a good reference for the correct command line options can be found here:
(Note: these example command lines aren't for reduced/minimized input sets.)
Then, if you are unable to use the cpu2000.py script described above, you could try something like this to run a statically linked ALPHA eon binary:
$ build/ALPHA_SE/m5.debug configs/example/se.py --cmd=eon00 --options="chair.control.cook chair.camera chair.surfaces chair.cook.ppm ppm pixels_out.cook"
SPEC 2006 (spec2k6)
We need to get bits of info from SPEC2006_benchmarks and commit some of the code mentioned there.
Full System (FS) Mode
Full System Files
We'll assume that you've already built an ALPHA_FS version of the M5 simulator, and downloaded and installed the full-system binary and disk image files. Then you can just run the fs.py configuration file in the m5/configs/example directory. For example:
% build/ALPHA_FS/m5.debug -d /tmp/output configs/example/fs.py
M5 Simulator System

Copyright (c) 2001-2006 The Regents of The University of Michigan
All Rights Reserved

M5 compiled Aug 16 2006 18:51:57
M5 started Wed Aug 16 21:53:38 2006
M5 executing on zeep
command line: ./build/ALPHA_FS/m5.debug configs/example/fs.py
0: system.tsunami.io.rtc: Real-time clock set to Sun Jan 1 00:00:00 2006
Listening for console connection on port 3456
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
warn: Entering event queue @ 0. Starting simulation...
<...simulation continues...>
By default, the fs.py script boots Linux and starts a shell on the system console. To keep console traffic separate from simulator input and output, this simulated console is associated with a TCP port. To interact with the console, you must connect to the port using a program such as telnet, for example:
% telnet localhost 3456
Telnet's echo behavior doesn't work well with m5, so if you are using the console regularly, you probably want to use m5term instead of telnet. By default m5 will try to use port 3456, as in the example above. However, if that port is already in use, it will increment the port number until it finds a free one. The actual port number used is printed in the m5 output.
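The port search described above can be illustrated with a short Python sketch. This is not M5's own listener code: the function name `find_free_port`, the `max_tries` limit, and the use of a plain bind attempt to detect a busy port are all assumptions made for illustration. Only the starting port 3456 and the "try the next higher port" rule come from the text.

```python
import socket


def find_free_port(start=3456, host="localhost", max_tries=100):
    """Return the first port >= start that can be bound on host.

    Illustrative only: mimics the described behavior, not M5's implementation.
    """
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is already in use, try the next higher one
            return port
    raise RuntimeError("no free port in the searched range")
```

If you script the console connection, read the actual port from the m5 output rather than assuming 3456, since a second simulation on the same host will end up on a different port.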
In addition to loading a Linux kernel, M5 mounts one or more disk images for its filesystems. At least one disk image must be mounted as the root filesystem. Any application binaries that you want to run must be present on these disk images. To begin running benchmarks without an interactive shell session, M5 can load .rcS files that replace the normal Linux boot scripts and execute directly after the OS boots. These .rcS files can be used to configure ethernet interfaces, execute special m5 instructions, or begin executing a binary on the disk image. The pointers for the Linux binary, disk images, and .rcS files are all set in the simulation script. (To see how these files work, see Simulation Scripts Explained.) For example, changing to / on the root filesystem and running ls shows:
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
Snippet of an .rcS file:
echo -n "setting up network..."
/sbin/ifconfig eth0 192.168.0.10 txqueuelen 1000
/sbin/ifconfig lo 127.0.0.1
echo -n "running surge client..."
/bin/bash -c "cd /benchmarks/surge && ./Surge 2 100 1 192.168.0.1 5"
echo -n "halting machine"
m5 exit
The m5term program allows the user to connect to the simulated console interface that full-system m5 provides. Simply change into the util/term directory and build m5term:
% cd m5/util/term
% make
gcc -o m5term term.c
% make install
sudo install -o root -m 555 m5term /usr/local/bin
The usage of m5term is:
./m5term <host> <port>

<host> is the host that is running m5

<port> is the console port to connect to. m5 defaults to using port 3456, but if the port is used, it will try the next higher port until it finds one available.

If there are multiple systems running within one simulation, there will be a console for each one. (The first system's console will be on 3456 and the second on 3457, for example.)

m5term uses '~' as an escape character. If you enter the escape character followed by a '.', the m5term program will exit.

m5term can be used to interactively work with the simulator, though users must often set various terminal settings to get things to work.
A slightly shortened example of m5term in action:
% m5term localhost 3456
==== m5 slave console: Console 0 ====
M5 console
Got Configuration 127
memsize 8000000 pages 4000
First free page after ROM 0xFFFFFC0000018000
HWRPB 0xFFFFFC0000018000 l1pt 0xFFFFFC0000040000 l2pt 0xFFFFFC0000042000 l3pt_rpb 0xFFFFFC0000044000 l3pt_kernel 0xFFFFFC0000048000 l2reserv 0xFFFFFC0000046000
CPU Clock at 2000 MHz IntrClockFrequency=1024
Booting with 1 processor(s)
...
...
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 480k freed
init started: BusyBox v1.00-rc2 (2004.11.18-16:22+0000) multi-call binary
PTXdist-0.7.0 (2004-11-18T11:23:40-0500)
mounting filesystems...
EXT2-fs warning: checktime reached, running e2fsck is recommended
loading script...
Script from M5 readfile is empty, starting bash shell...
# ls
benchmarks  etc     lib         mnt      sbin  usr
bin         floppy  lost+found  modules  sys   var
dev         home    man         proc     tmp   z
#
Full System Benchmarks
We have several full-system benchmarks already up and running. The binaries are available in the disk images you can obtain/download from us, and the .rcS files are in the m5/configs/boot/ directory. To run any of them, you merely need to set the benchmark option to the name of the test you want to run. For example:
% ./build/ALPHA_FS/m5.opt configs/example/fs.py -b NetperfMaerts
To see a comprehensive list of all benchmarks available:
% ./build/ALPHA_FS/m5.opt configs/example/fs.py -h
Checkpoints
First of all, you need to create a checkpoint. After booting the M5 simulator, execute the following command (in the shell):
which will create a new directory with the checkpoint, named 'cpt.TICKNUMBER'.
With the new simulator (2.0 beta2), restoring from a checkpoint can usually be done from the command line, e.g.:
build/ALPHA_FS/m5.debug configs/example/fs.py -r N
OR
build/ALPHA_FS/m5.debug configs/example/fs.py --checkpoint-restore=N
The number N is an integer that identifies the checkpoint when the checkpoints are ordered lexically (i.e. by tick number): the checkpoint with the oldest tick is number 1, the next checkpoint is number 2, and so on.
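To make the numbering concrete, here is a minimal Python sketch that maps N to a checkpoint directory. It is not the code fs.py uses. It assumes that all checkpoints sit in one output directory as 'cpt.TICKNUMBER' subdirectories and that "oldest tick" means the smallest numeric tick; the function name `nth_checkpoint` is hypothetical.

```python
import os
import re


def nth_checkpoint(outdir, n):
    """Return the path of checkpoint number n (1 = oldest tick) in outdir.

    Illustrative only; assumes checkpoint directories are named cpt.<tick>.
    """
    pattern = re.compile(r"^cpt\.(\d+)$")
    found = []
    for name in os.listdir(outdir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(outdir, name)):
            found.append((int(match.group(1)), name))
    found.sort()  # numeric order, so tick 900 comes before tick 10000
    if not 1 <= n <= len(found):
        raise ValueError("no checkpoint number %d in %s" % (n, outdir))
    return os.path.join(outdir, found[n - 1][1])
```

For an output directory holding cpt.900 and cpt.10000, `nth_checkpoint(outdir, 1)` selects cpt.900 under these assumptions. A plain string sort would put cpt.10000 first, which is why the sketch compares ticks as integers.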
Switchover/Fastforwarding
Sampling (switching between functional and detailed models) can be implemented via your Python script. In your script you can direct the simulator to switch between two sets of CPUs. To do this, set up a list of (oldCPU, newCPU) tuples in your script. If there are multiple CPUs you wish to switch simultaneously, they can all be added to that list. For example:
run_cpu1 = SimpleCPU()
switch_cpu1 = DetailedCPU(defer_registration=True)
run_cpu2 = SimpleCPU()
switch_cpu2 = FooCPU(defer_registration=True)
switch_cpu_list = [(run_cpu1, switch_cpu1), (run_cpu2, switch_cpu2)]
Note that the CPU that does not immediately run should have the parameter "defer_registration=True". This keeps those CPUs from adding themselves to the list of CPUs to run; they will instead get added when you switch them in.
In order for M5 to instantiate all of your CPUs, you must make the CPUs that will be switched in a child of something that is in the configuration hierarchy. Unfortunately at the moment some configuration limitations force the switch CPU to be placed outside of the System object. The Root object is the next most convenient place to place the CPU, as shown below:
root1 = Root()
root1.system = System(cpu = run_cpu1)
root1.switch_cpu = switch_cpu1
root2 = Root()
root2.system = System(cpu = run_cpu2)
root2.switch_cpu = switch_cpu2
This will add the switch CPUs as children of each root object. Note that switch_cpu is not an actual parameter for Root, but is just an assignment to indicate that it has a child, switch_cpu.
After the systems and the CPU list are set up, your script can direct M5 to switch the CPUs at the appropriate cycle. This is achieved by calling switchCpus(cpus_list). For example, assuming the code above, and a system that is set up to run run_cpu1 and run_cpu2 initially:
m5.simulate(500) # simulate for 500 cycles
m5.switchCpus(switch_cpu_list)
m5.simulate(500) # simulate another 500 cycles after switching
Note that M5 may have to simulate for a few cycles prior to switching CPUs due to any outstanding state that may be present in the CPUs being switched out.
How to run multiprogrammed workloads
In SE mode, simply create a system with multiple CPUs and assign a different workload object to each CPU's workload parameter. If you're using the O3 model, you can also assign a vector of workload objects to one CPU, in which case the CPU will run all of the workloads concurrently in SMT mode. Note that SE mode has no thread scheduling; if you need a scheduler, run in FS mode and use the fine scheduler built into the Linux kernel.
How do I terminate multiprogram workloads?
There are some very fundamental issues with whatever approach you choose. Here are your options:
- Terminate as soon as any thread reaches a particular maximum number of instructions. This option is equivalent to max_insts_any_thread. The potential problem here is that because of the inherent non-determinism of multithreaded programs, there is no way to ensure that all experiments do the same work. You might also not get the same amount of work done. For example, if you have two threads, one of them must reach the maximum. The other could either execute no instructions, or could execute max-1 instructions. The benefit of this approach is that all threads are running fully until the simulation terminates (provided that none of the threads terminate early due to some other condition.)
- Terminate once all threads have reached a maximum number of instructions. This option is equivalent to max_insts_all_threads. In this mode, we make sure all threads do at least a certain amount of work, but threads that reach the maximum continue executing. This has the same benefit as the previous example, but also suffers from the problem that non-determinism will cause you to potentially not do the same amount of total work.
- In this unimplemented mode, all threads would run for exactly a specified number of instructions with some threads terminating early. All threads will do the same amount of work thus avoiding the problem of the previous options. The downside of this option is that the threads may not all be running for the entire simulation. For example, one thread might finish its instructions almost right away, while the other thread has quite a bit left to do. When this happens, you're only running a multiprogram workload for a fraction of the total time.
- Another unimplemented option would be to specify how many instructions each thread has to complete before exiting. This would allow a balance to be struck between options 1 and 2 if the user experimented to figure out what a good mix was.
If you want to implement either of the unimplemented options, or if you have other ideas, please let us know!
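The difference between the two implemented policies can be seen in a toy Python model. This is not M5 code: it assumes each thread commits a fixed, positive number of instructions per cycle, which real workloads do not. The function name `instructions_at_stop` and the policy labels "any" and "all" are hypothetical stand-ins for max_insts_any_thread and max_insts_all_threads.

```python
def instructions_at_stop(rates, max_insts, policy):
    """Toy model: thread i commits rates[i] instructions every cycle.

    Returns (stop_cycle, per-thread instruction counts) at the first cycle
    where the stop condition holds. policy is "any" (stop when any thread
    reaches max_insts) or "all" (stop when every thread has reached it).
    """
    if policy not in ("any", "all"):
        raise ValueError("policy must be 'any' or 'all'")
    if not rates or any(r <= 0 for r in rates):
        raise ValueError("rates must be a non-empty list of positive numbers")
    reached = any if policy == "any" else all
    counts = [0] * len(rates)
    cycle = 0
    while not reached(c >= max_insts for c in counts):
        cycle += 1
        counts = [c + r for c, r in zip(counts, rates)]
    return cycle, counts


print(instructions_at_stop([4, 1], 100, "any"))  # (25, [100, 25])
print(instructions_at_stop([4, 1], 100, "all"))  # (100, [400, 100])
```

With one fast and one slow thread, the "any" policy stops while the slow thread has done only a quarter of the target, and the "all" policy lets the fast thread run to four times the target. In both cases the total work depends on the relative speeds of the threads, which is the problem the list above describes.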