Requirements
**Educational Background and Experience**:
Minimum Bachelor's degree in Bioinformatics, Computational Biology, Computer Science, Biology, or a related field.
Minimum 3 years of professional experience in developing bioinformatics pipelines and software tools (whole genome sequencing or whole exome sequencing germline datasets: Bioinformatics/Python/FastQC/bedtools/hisat2/salmon/DESeq2/PyMol).
Experience includes data analysis, statistical methods, and data visualization for genomics, proteomics, or structural biology data.
Experience includes cloud computing, database management, and querying large datasets.
Certifications in relevant bioinformatics tools or techniques (whole genome sequencing or whole exome sequencing germline datasets: Bioinformatics / Python / FastQC / bedtools / hisat2 / salmon / DESeq2 / PyMol).
**Language**: Strong written and spoken English.
Competencies and Skills
Strong foundation in biological concepts, genetics, and molecular biology.
Proficiency in programming languages such as Python, R, and familiarity with data structures and algorithms.
Knowledge of bioinformatics tools, software, and databases commonly used in biological data analysis.
Strong problem-solving skills with the ability to approach complex biological and computational challenges creatively.
Excellent communication skills to convey technical information clearly to both technical and non-technical audiences.
Responsibilities
The consultant's general responsibilities under this aspect shall include, yet not be limited to, the following:
**Design of Data Processing Workflow for BGI platforms**: Define the comprehensive data processing workflow to handle diverse data sources and generate accurate results. The workflow should encompass the following components:
a. **Data Ingestion**: Develop a mechanism to efficiently ingest raw sequencing data from Illumina and Oxford Nanopore Technologies platforms.
b. **Quality Control**: Implement quality control checks to identify and filter out low-quality reads, adapters, and other artifacts.
c. **Alignment**: Utilize appropriate alignment algorithms to map sequencing reads to the reference genome, optimizing alignment parameters for each data source.
d. **Variant Calling**: Develop a robust variant calling strategy to identify single nucleotide variants (SNVs), insertions, deletions, and structural variations.
e. **Annotation**: Integrate variant annotation tools to provide functional insights into the detected genomic variations.
f. **Database System and Data Storage**: Design a structured database system to efficiently store processed data, ensuring data integrity and easy accessibility. Pusdatin, MoH has allocated storage slots for BGSI that can be used for this database system.
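As an illustration of the quality control component, the following is a minimal sketch of filtering reads by mean base quality and length. It assumes Phred+33 encoded FASTQ input; the function names and the thresholds (`min_mean_q`, `min_length`) are hypothetical and would need to be tuned per platform. A production workflow would more likely rely on established tools such as FastQC for reporting.

```python
from statistics import mean


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return mean(ord(ch) - offset for ch in quality)


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes the simple four-line-per-record FASTQ layout.
    """
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_reads(records, min_mean_q: float = 20.0, min_length: int = 30):
    """Keep reads that are long enough and have adequate mean quality.

    Both thresholds are illustrative defaults, not recommended values.
    """
    for header, seq, qual in records:
        if len(seq) >= min_length and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


if __name__ == "__main__":
    example = ["@read1", "ACGT", "+", "IIII", "@read2", "ACGT", "+", "####"]
    for record in filter_reads(parse_fastq(example), min_length=4):
        print(record[0])
```

Keeping parsing and filtering as separate generators lets the same filter run on streamed input without loading a whole file into memory.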
**Development for BGI platforms**: If no existing best practice pipeline meets the specific requirements, consider custom development to build a tailored data processing pipeline. This development phase involves:
a. **Pipeline Architecture**: Design the architecture of the data processing pipeline, considering modularity, scalability, and maintainability.
b. **Algorithm Selection**: Choose appropriate algorithms and tools for each processing step based on their compatibility with the data sources and the desired outcomes.
c. **Custom Scripting**: Write custom scripts to integrate various tools, optimise data flow, and ensure seamless execution of the processing pipeline.
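One possible reading of the modularity requirement is a pipeline built from small, independently replaceable stages. The sketch below is only an illustration under that assumption: the `Pipeline` class, the stage functions, and the file naming are all hypothetical placeholders, and a real implementation might instead use a workflow manager.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A stage takes the shared context and returns an updated copy of it.
Stage = Callable[[Dict], Dict]


@dataclass
class Pipeline:
    """Ordered collection of named stages (hypothetical helper)."""

    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "Pipeline":
        self.stages.append((name, stage))
        return self  # allow chaining

    def run(self, context: Dict) -> Dict:
        ctx = dict(context)  # do not mutate the caller's dict
        completed = []
        for name, stage in self.stages:
            ctx = stage(ctx)
            completed.append(name)
        ctx["completed"] = completed
        return ctx


# Placeholder stages: each only records the file it would produce.
def ingest(ctx: Dict) -> Dict:
    return {**ctx, "raw_reads": f"{ctx['sample_id']}.fastq.gz"}


def quality_control(ctx: Dict) -> Dict:
    return {**ctx, "clean_reads": f"{ctx['sample_id']}.clean.fastq.gz"}


if __name__ == "__main__":
    pipeline = Pipeline().add("ingest", ingest).add("quality_control", quality_control)
    print(pipeline.run({"sample_id": "S1"}))
```

Because each stage only reads from and writes to the shared context, an alignment or variant calling stage can be swapped for a different tool without changing the stages around it.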
**Pipeline Validation for BGI platforms**: Validate the pipeline using standardised data sets to ensure accuracy and reliability. This validation phase includes:
a. **Benchmark Data**: Select standardised benchmark data sets with known variants to compare pipeline results against expected outcomes.
b. **Performance Metrics**: Define performance metrics, such as sensitivity, specificity, and accuracy, to assess the pipeline's performance.
c.