Lucas van Dijk

I am a PhD candidate in bioinformatics in the Delft Bioinformatics Lab at TU Delft, and my project is in close collaboration with the Bacterial Genomics Group at the Broad Institute of MIT and Harvard, where I do most of my work. My background is in electrical engineering and computer science, and I have gained a huge interest in biology over the past few years. Now I am using my knowledge of algorithms, data structures and machine learning to help answer biological questions.

My research focuses on developing new tools and algorithms to analyse bacterial populations. While many bacteria are harmless, some specific strains can become pathogenic, even though they're considered the same species. The goal is to improve our ability to track and dissect the genomes of specific bacterial strains in (DNA) sequencing datasets. This will aid our understanding of how antibiotic resistance arises in bacteria, or how the genes responsible for antibiotic resistance spread within a patient or hospital.

MSc in Computer Science (Bioinformatics)
Delft University of Technology, cum laude

Comparing (DNA) sequences is one of the core tasks in bioinformatics, and the classic approach is to align these sequences. This is, however, a relatively slow process and not always computationally feasible, especially if you want to compare more than two DNA sequences. An alternative approach is to compare sequences based on their k-mer profiles.

A k-mer of a string $S$ is defined as any substring of $S$ of length $k$. For example, the DNA sequence AGCGTATCGATTCA has the following k-mers if $k=6$:

    AGCGTATCGATTCA
    --------------
    AGCGTA
     GCGTAT
      CGTATC
        ...
            GATTCA

As you can see, obtaining all k-mers is easy: slide a window of size $k$ along your sequence, yielding a k-mer at each position. A sequence of length $L$ has $L - k + 1$ k-mers. A common task is to count how often each k-mer occurs and compare genomes based on these counts.
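
The sliding window can be sketched in a few lines of Python. This is only a minimal illustration: the function names `kmers` and `count_kmers` are made up for this example and do not come from any particular library, and the counts are kept exact in an ordinary in-memory `Counter`.

```python
from collections import Counter


def kmers(sequence, k):
    """Yield every k-mer of `sequence` by sliding a window of size k."""
    # A sequence of length L has L - k + 1 window positions.
    # If the sequence is shorter than k, the range is empty.
    for start in range(len(sequence) - k + 1):
        yield sequence[start:start + k]


def count_kmers(sequence, k):
    """Return a Counter mapping each k-mer to how often it occurs."""
    return Counter(kmers(sequence, k))


sequence = "AGCGTATCGATTCA"
k = 6

all_kmers = list(kmers(sequence, k))
print(all_kmers[0], all_kmers[-1])              # first and last window
print(len(all_kmers) == len(sequence) - k + 1)  # checks the L - k + 1 rule

counts = count_kmers(sequence, k)
print(counts.most_common(3))
```

For a real genome you would probably not hold every k-mer string in a plain dictionary like this, but the example shows the basic bookkeeping.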
The main idea is that similar genomes have similar k-mer counts [1].

When dealing with the scale of genomes, storing counts for all these different k-mers can take up quite a lot of memory. First, the number of distinct k-mers grows exponentially with $k$. In the case of DNA sequences, our alphabet size is 4: A, C, T, G. So there are $4^k$ possible k-mers of length $k$. The value of $k$ depends on your application and organism, but values ranging from 5 to 32 are common.

Next, think about how we would store the k-mer itself. We could store each letter as an ASCII character, requiring 8 bits per character. This is a bit wasteful, however, because in DNA we only have 4 different characters. An optimisation would be to use 2 bits per character: A=00, C=01, T=10, G=11. This would allow us to store a k-mer of length 32 in a 64-bit integer. Still, this may not be good enough. I've seen cases for $k=23$ where it went up to more than 100 GB, and that's quite a lot of memory even if you have access to a decent compute cluster.

This post will explain a technique described in the paper by Cleary et al. [2] to reduce the memory consumption for storing k-mers. The main insight is that we often don't need the exact count of each k-mer, and can take some shortcuts by missing some k-mers. Because of the exponential number of different k-mers and because genomes are often large, missing a few k-mers will not have a huge impact. Furthermore, when dealing with whole genome sequencing datasets, you also have to deal with sequencing errors, and expect some k-mers to be false. In a lot of cases using approximate k-mer counts is appropriate.

On this date 65 years ago, February 1st 1953, the Netherlands experienced its greatest flood to date, the North Sea Flood of 1953. This is still one of the biggest disasters the Netherlands has ever experienced, with thousands of casualties and lots of people who lost their homes.

The Netherlands earns its name from the fact that large parts of the country lie below sea level.
To make sure our country doesn't flood, we have built lots of barriers, dams and dykes to keep the water out, and we want to prevent anything like the flood of 1953 from ever happening again.

At the beginning of this year, several of these barriers and dams were put to the test when a heavy storm reached the Netherlands, which resulted in very high water levels. Our five biggest dams and barriers needed to be closed at the same time, a first since their construction. We can ask ourselves whether this will happen more often now that sea levels are rising due to global warming [1]. A higher baseline sea level increases the chance of even higher water levels during a storm. To get an idea of which areas would be affected the most by a possible flood, we have created a visualisation project that shows the height of the Netherlands in comparison to the sea level.

Part of my Google Summer of Code project involves porting several arrowheads from Glumpy to Vispy. I also want to make a slight change to them: the arrowheads in Glumpy include an arrow body, and I want to remove that so you can put an arrowhead on any type of line you want. Making a change like that requires that you understand how those shapes are drawn. And for someone without a background in computer graphics, this took some thorough investigation of the code and the techniques used. This article is aimed at people like me: good enough programming skills and linear algebra knowledge, but almost no prior experience with OpenGL or computer graphics in general.

PHASM: Haplotype-aware de novo genome assembly
A de novo genome assembler written in Python that leverages the assembly graph to output DNA sequences for each haplotype.

Raspberry Pi TLC5940 library
A C++ library to control the TLC5940 LED driver from your Raspberry Pi
