# Omnicrobe Database Data Integration Workflow

[![Snakemake](https://img.shields.io/badge/snakemake-≥6.15.1-brightgreen.svg)](https://snakemake.readthedocs.io)
[![Python](https://img.shields.io/badge/python-≥3.10-brightgreen.svg)](https://www.python.org/)
[![Psycopg2](https://img.shields.io/badge/psycopg2-%E2%89%A52.8.6-brightgreen.svg)](https://pypi.org/project/psycopg2/)

## Source code

```bash
$ git clone git@forgemia.inra.fr:omnicrobe/omnicrobe-database.git
```

## Creation of database tables

```bash
$ psql -h [database host server] -U [database username] -d [database name] -f create.sql
```
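
To confirm that `create.sql` ran as intended, one option is to list the tables afterwards. The sketch below wraps psql's `\dt` meta-command in a small helper. The function name `list_tables` is hypothetical and not part of this repository; the arguments are the same placeholders as in the command above.

```bash
# Hypothetical helper (not part of the repository): list the tables in the
# target database so the result of create.sql can be inspected by eye.
list_tables() {
    local host="$1" user="$2" dbname="$3"
    # \dt is psql's meta-command for listing tables
    psql -h "$host" -U "$user" -d "$dbname" -c '\dt'
}

# Usage: list_tables [database host server] [database username] [database name]
```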

*To clean an existing database:*

```bash
$ python scripts/cleaning.py [database name] [database host server] [database password] [database username]
```
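
The command above takes the password as a positional argument, so typing it directly typically leaves it in the shell history. A minimal sketch of one way around that, assuming bash: prompt for the password without echoing it, then pass it on in the argument order documented above. The function name `clean_db` is hypothetical. The password is still visible in the process list while the script runs, so this only addresses the history concern.

```bash
# Hypothetical wrapper (not part of the repository), assuming bash.
clean_db() {
    local dbname="$1" host="$2" user="$3" password
    # -s: do not echo the typed characters; -r: keep backslashes literal
    read -r -s -p "Database password: " password
    echo >&2   # newline after the silent prompt
    # Argument order follows this README: name, host, password, username
    python scripts/cleaning.py "$dbname" "$host" "$password" "$user"
}

# Usage: clean_db [database name] [database host server] [database username]
```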

## How to use the workflow?

### Configuration file

Before running the Snakemake pipeline, edit the `config.yaml` configuration file to set the database connection information, file locations, and versions.
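
After editing, a quick sanity check can catch two common mistakes before Snakemake does: a missing or empty file, and tab characters used for indentation, which YAML does not allow. This is only a sketch, assuming bash; the function name `check_config` is hypothetical, and it does not validate the keys the workflow expects.

```bash
# Hypothetical pre-flight check (not part of the repository), assuming bash.
check_config() {
    local cfg="${1:-config.yaml}"
    # -s: file exists and is not empty
    if [ ! -s "$cfg" ]; then
        echo "missing or empty: $cfg" >&2
        return 1
    fi
    # YAML does not allow tab characters for indentation
    if grep -q -E $'^ *\t' "$cfg"; then
        echo "tab used for indentation in: $cfg" >&2
        return 1
    fi
    echo "ok: $cfg"
}

# Usage: check_config config.yaml
```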

### Execution environment

```bash
$ conda env create -f environment.yml
```

```bash
$ conda activate env_omnicrobe
```

### Workflow execution

```bash
(env_omnicrobe) $ snakemake --cluster "qsub -V -S /bin/bash -cwd -N OMNICROBE -b y -q {params.queue}" -j 1 --latency-wait 60
```
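
Before submitting jobs to the cluster, it can be worth previewing what Snakemake plans to run. The sketch below uses Snakemake's `--dry-run` and `--printshellcmds` options, which list the planned jobs and their shell commands without executing anything. The wrapper name `dry_run` is hypothetical.

```bash
# Hypothetical wrapper (not part of the repository).
dry_run() {
    # --dry-run: show the planned jobs without executing them
    # --printshellcmds: also print the shell command of each job
    snakemake --dry-run --printshellcmds "$@"
}

# Usage, inside the activated env_omnicrobe environment: dry_run
```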