Getting started.
MolluscaGenes is run from a terminal — a text-only window
that lets you type commands. On macOS that's the
Terminal app (/Applications/Utilities/Terminal.app),
on Linux it's any terminal emulator (GNOME Terminal, Konsole, xterm, …),
and on Windows we recommend the Windows Subsystem for Linux
with Ubuntu, which gives you a real Linux shell.
The instructions below assume you have a Bash- or zsh-style shell open.
What's a "shell"? The shell is the program that
interprets the commands you type and dispatches them. Everywhere this
tutorial says "in the terminal", you can paste the line into
your shell and press Enter. A line beginning with
$ is a command you type — you don't paste the
$ itself.
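If you are unsure which shell you have, a quick check along the following lines may help. It is only a sketch: it relies on the standard SHELL environment variable, which most systems set to your login shell, and falls back to "unknown" when it is missing. The variable name current_shell is made up for this example.

```shell
# Read the login shell configured for your account (for example /bin/bash or /bin/zsh).
# SHELL is normally set at login; fall back to "unknown" if it is not.
current_shell="${SHELL:-unknown}"
echo "Login shell: $current_shell"
```

A path ending in bash or zsh means the commands in this tutorial should work as written.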
One-time setup, in five minutes
- Install conda if you don't already have it. We recommend Miniforge; follow the installer prompts and reopen your terminal afterward.
- Clone the repository. This downloads the wrappers, metadata, HMMs, and site code to a folder on your computer:

      git clone https://github.com/invertome/molluscagenes
      cd molluscagenes

  git clone copies the GitHub repo to a new folder named molluscagenes; cd ("change directory") moves your terminal session into that folder. Every command in this tutorial assumes you are inside it.
- Create the conda environment. This installs every external tool the wrappers call (BLAST, DIAMOND, HMMER, MAFFT, ClipKit, IQ-TREE, TreeShrink, plus a few helpers) into an isolated conda environment named molluscagenes. It only needs to be done once and takes 5–15 minutes:

      conda env create -f environment.yml
      conda activate molluscagenes

  The activate step is something you'll need to repeat every time you open a new terminal session.
- Download the database from Zenodo. Pick a folder with at least 20 GB of free disk space for the BLAST and DIAMOND files (these are too big for GitHub). The mg_fetch.sh wrapper does the download, verifies SHA256 checksums, extracts the tarballs, and writes a configuration file that tells the other wrappers where the data lives:

      bash wrappers/mg_fetch.sh /path/to/storage

  Replace /path/to/storage with the actual folder you picked. If you want to see what would be downloaded without actually downloading anything, add --dry-run.
- Source the configuration file in every new terminal session before running any wrapper:

      source config.sh

  This sets a handful of environment variables ($MG_BLAST_AA, $MG_HMM, …) that the wrappers read. If you forget this step you'll see a "no config.sh found" error.
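Two of the steps above are easy to get wrong: picking a folder with too little free space, and forgetting the per-session activate and source commands. The sketch below shows one way to check both from the shell. The function names mg_has_free_gb and mg_session_ready are hypothetical helpers written for this example and are not part of MolluscaGenes. The sketch assumes that conda sets CONDA_DEFAULT_ENV to the active environment's name (its usual behavior for named environments) and that config.sh sets the two variables named in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not shipped with MolluscaGenes.

# mg_has_free_gb DIR GB
# Succeeds if the filesystem holding DIR has at least GB gigabytes available.
mg_has_free_gb() {
    local dir="$1" need_gb="$2" avail_kb
    # -P forces one line per filesystem; -k reports sizes in 1024-byte blocks.
    # Column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    # Fail if df produced nothing (for example, DIR does not exist).
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

# mg_session_ready
# Succeeds if the conda environment is active and config.sh has been sourced.
mg_session_ready() {
    local status=0
    # Assumption: conda exports CONDA_DEFAULT_ENV with the active environment name.
    if [ "${CONDA_DEFAULT_ENV:-}" != "molluscagenes" ]; then
        echo "conda environment not active: run 'conda activate molluscagenes'" >&2
        status=1
    fi
    # MG_BLAST_AA and MG_HMM are the two variables this tutorial names.
    if [ -z "${MG_BLAST_AA:-}" ] || [ -z "${MG_HMM:-}" ]; then
        echo "configuration not loaded: run 'source config.sh'" >&2
        status=1
    fi
    return "$status"
}
```

For example, running mg_has_free_gb /path/to/storage 20 before mg_fetch.sh would flag a folder that is too small, and running mg_session_ready at the start of a session prints a reminder for whichever step was skipped.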