1 00:00:00,120 --> 00:00:02,399 Male Speaker: Sure. So please join me in welcoming today’s 2 00:00:02,399 --> 00:00:05,240 speaker and my longtime colleague, Dr. Eric Green. 3 00:00:05,240 --> 00:00:05,779 [applause] 4 00:00:05,779 --> 00:00:10,429 Eric Green: Thank you, Andy, it’s a -- it’s a pleasure 5 00:00:10,429 --> 00:00:18,070 to be here. As Andy may have mentioned in week one of this series -- the series actually 6 00:00:18,070 --> 00:00:24,190 started with Andy and I and then eventually bringing Tyra Wolfsberg into this picture. 7 00:00:24,190 --> 00:00:29,020 I’ve -- starting back in 1995, this is like the 12th time I’ve given some version of 8 00:00:29,020 --> 00:00:34,719 this lecture. I should have by the way immediately thanked Tyra and Andy for organizing this 9 00:00:34,719 --> 00:00:40,559 iteration of it, the 2016 version. They included my name on that, which was very kind of them. 10 00:00:40,559 --> 00:00:43,959 It’s purely honorific. I really had nothing to do with organizing this, and you should 11 00:00:43,960 --> 00:00:48,249 thank them for all the logistical aspects of bringing this year’s series together. 12 00:00:48,249 --> 00:00:56,429 I’m more of a legacy left as a named co-organizer, but it is -- it is a pleasure to be here. 13 00:00:56,429 --> 00:01:02,879 I will say that the title of my presentation sort of is very broad with genomics and the 14 00:01:02,879 --> 00:01:07,440 genomics landscape as I see it today. I will point out I’m going to emphasize heavily 15 00:01:07,440 --> 00:01:12,280 the human genomics landscape for reasons that will become pretty obvious, but I’m going 16 00:01:12,280 --> 00:01:17,550 to limit this a little bit, especially towards the end. 
I should also immediately point out, 17 00:01:17,550 --> 00:01:21,000 as you might imagine, I’m fairly boring, so I have absolutely no relevant financial 18 00:01:21,000 --> 00:01:25,630 relationship with commercial interests, and the other thing I should point out is that 19 00:01:25,630 --> 00:01:32,670 major aspect of what I want to try to accomplish today is really context setting, both for 20 00:01:32,670 --> 00:01:36,430 those of you sort of using this series as an opportunity to learn a lot of genomics 21 00:01:36,430 --> 00:01:41,270 for the first time but also as a framework to some extent for the speakers that will 22 00:01:41,270 --> 00:01:45,140 follow. In fact, usually I give the lead-off talk in the series for that exact reason, 23 00:01:45,140 --> 00:01:49,250 and my schedule just didn’t allow me to go last week, but you will see as I go through 24 00:01:49,250 --> 00:01:51,860 why this is very much of a context setting talk. 25 00:01:51,860 --> 00:01:56,460 I’m going to first start off giving you some historical context for genomics as a 26 00:01:56,460 --> 00:02:00,990 backdrop and then talk about some of the major achievements that have happened since the 27 00:02:00,990 --> 00:02:06,410 Human Genome Project ended 13 years ago, but really emphasize paint the landscape of what 28 00:02:06,410 --> 00:02:13,790 the human genomic circumstance is today and importantly beyond, and as I said, really, 29 00:02:13,790 --> 00:02:17,920 my goal for this as much as anything besides giving sort of a foundation of information 30 00:02:17,920 --> 00:02:23,969 about genomics is to really help you see how the other speakers fit into this landscape 31 00:02:23,969 --> 00:02:25,150 that I’m about to paint. 
32 00:02:25,150 --> 00:02:32,870 So, in terms of a historical context, if we really go back, even before genomics was brought 33 00:02:32,870 --> 00:02:38,450 about as a discipline, it is really important to -- and think about the series of major 34 00:02:38,450 --> 00:02:44,819 historical figures and their important contributions to help really fertilize the ground, if you 35 00:02:44,819 --> 00:02:49,420 will, of which genomics was able to grow out of. I could probably spend the whole talking 36 00:02:49,420 --> 00:02:52,200 about that. I’m just going to give what I think are the -- some of the key highlights 37 00:02:52,200 --> 00:02:56,499 to think about. Obviously, Mendel deserves tremendous credit understanding and elucidating 38 00:02:56,499 --> 00:03:01,340 some of the basic laws of inheritance. Of course, he had no idea where those -- that 39 00:03:01,340 --> 00:03:05,769 inheritance was actually coded for or where the information resided. Some of the clues 40 00:03:05,769 --> 00:03:10,620 started to come about with Miescher’s work in the late 1800s when he actually discovered 41 00:03:10,620 --> 00:03:15,749 DNA as a -- as a molecule, but it really wasn’t until Avery and colleagues’ discoveries 42 00:03:15,749 --> 00:03:21,370 of the 1940s that demonstrated that DNA must be this inherited material, which therefore 43 00:03:21,370 --> 00:03:26,950 focused a lot of interest on DNA as an information molecule, setting the stage brilliantly for 44 00:03:26,950 --> 00:03:31,439 what Watson and Crick were able to accomplish in 1953. 45 00:03:31,439 --> 00:03:35,930 In fact, I would contend that the Watson-Crick discovery of the Double-Helical structure 46 00:03:35,930 --> 00:03:43,069 of DNA in 1953 was arguably the most important single biomedical research discovery of the 47 00:03:43,069 --> 00:03:46,790 last century. 
I would certainly contend that the paper shown to the left was the most important 48 00:03:46,790 --> 00:03:52,499 publication of the last century because what happened with the insights brought about by 49 00:03:52,499 --> 00:03:57,930 the knowledge of the structure of DNA really set the stage for then really figuring out 50 00:03:57,930 --> 00:04:04,319 how DNA was the information molecule of life and how it therefore encoded all the life 51 00:04:04,319 --> 00:04:10,319 processes, if you will. That was of course in the 1950s. It’s then in the 1960s some 52 00:04:10,319 --> 00:04:15,048 of the key things we saw were -- for example, the elucidation of the genetic code. Those 53 00:04:15,049 --> 00:04:18,660 who don’t realize much of that work was done right here on this campus, in fact just 54 00:04:18,660 --> 00:04:23,370 outside of this auditorium is a small museum exhibit talking about some of Marshall Nirenberg’s 55 00:04:23,370 --> 00:04:28,770 work in elucidating this genetic code that we all now take for granted as the key translator 56 00:04:28,770 --> 00:04:35,120 table of going from DNA sequence to protein sequence. Better and better tools started 57 00:04:35,120 --> 00:04:40,550 to come about in the 1970s and particularly the 1980s, leading to DNA cloning, where for 58 00:04:40,550 --> 00:04:47,280 the first time we were able to actually isolate and clone and manipulate DNA in the laboratory 59 00:04:47,280 --> 00:04:52,729 and even then being able to develop methods such as in the late 1970s coming about and 60 00:04:52,729 --> 00:04:58,780 much improved throughout the 1980s to actually read out the Gs, As, Ts, and Cs within DNA. 
61 00:04:58,780 --> 00:05:05,780 So, this progression from Mendel all the way through molecular biology and DNA cloning 62 00:05:05,780 --> 00:05:12,150 in many ways then set the stage for what transpired in the late 1980s, and what transpired was 63 00:05:12,150 --> 00:05:17,409 the coming together of all of these tools and technologies to allow us to start thinking 64 00:05:17,409 --> 00:05:24,400 about how to go and study in a more comprehensive way our genomes, and that gave birth to this 65 00:05:24,400 --> 00:05:31,638 field called genomics, and you may not realize that that word didn’t even exist until 1987, 66 00:05:31,639 --> 00:05:37,319 at least not in the scientific literature. In fact, the first use of the word genomics 67 00:05:37,319 --> 00:05:43,550 in scientific print came about in this lead editorial of a brand new journal called “Genomics,” 68 00:05:43,550 --> 00:05:48,280 1987, where they talked about a new discipline, a new name, a new journal, and in the lead 69 00:05:48,280 --> 00:05:52,909 editorial, they talked about this newly developing discipline of genome mapping and sequencing, 70 00:05:52,909 --> 00:05:58,090 for which they had adopted the term genomics and put this into the scientific literature. 71 00:05:58,090 --> 00:06:02,719 Eighty-seven was particularly relevant for me because it was the year I graduated as 72 00:06:02,720 --> 00:06:08,669 an M.D.-Ph.D. student, reminding myself why I had never heard the word genomics once in 73 00:06:08,669 --> 00:06:12,479 medical school or graduate school because it simply didn’t exist. 
So, it really is 74 00:06:12,479 --> 00:06:17,688 important to emphasize we’re only sort of 30 years into this word of genomics as a discipline, 75 00:06:17,689 --> 00:06:22,280 so it is a remarkably young discipline, and I think its prominence on the biomedical 76 00:06:22,280 --> 00:06:27,539 research stage sometimes confuses us to think that, “Wow, it’s been around forever,” 77 00:06:27,539 --> 00:06:30,539 but it really hasn’t. It actually is a very youthful discipline. 78 00:06:30,539 --> 00:06:34,099 Now, of course, the reason the discipline was named and the reason that there was a 79 00:06:34,099 --> 00:06:40,729 lot of attention in the late 1980s about genomics was because of this idea that was crafted 80 00:06:40,729 --> 00:06:46,349 in the late 1980s and launched in October of 1990, this notion of a Human Genome Project, 81 00:06:46,349 --> 00:06:52,688 this large, audacious, international project that aimed among a number of goals to read 82 00:06:52,689 --> 00:06:58,210 out the three billion Gs, As, Ts, and Cs that constitute the human genome. It’s important 83 00:06:58,210 --> 00:07:03,758 to point out, by the way, that we did have an odometer moment recently, October 1 of 84 00:07:03,759 --> 00:07:09,000 last year, of 2015, marked the 25th anniversary -- 25th anniversary of the launch of the Human 85 00:07:09,000 --> 00:07:14,780 Genome Project. Painful for me to think about because I was a trainee. I feel like it was 86 00:07:14,780 --> 00:07:19,340 just yesterday, but it’s been 25 years since I was there, literally at the starting line 87 00:07:19,340 --> 00:07:24,030 participating in the genome project on day one and involved in it throughout its entire 88 00:07:24,030 --> 00:07:31,770 13-year span. 
In fact, it was remarkably successful, finished ahead of time, not in the original 89 00:07:31,770 --> 00:07:37,799 15 years, finished in 13 years, and it’s now just sort of a key part of the rich history 90 00:07:37,800 --> 00:07:39,759 of biomedical research. 91 00:07:39,759 --> 00:07:44,710 I actually had the opportunity to co-write a perspective piece that some of you might 92 00:07:44,710 --> 00:07:48,568 be interested in to commemorate the 25th anniversary of the launch of the Human Genome Project, 93 00:07:48,569 --> 00:07:54,060 and I did it with the two individuals who have held the job I now hold, Jim Watson, 94 00:07:54,060 --> 00:07:57,849 the original director of the institute I now lead, and Francis Collins previously was the 95 00:07:57,849 --> 00:08:01,090 director of the institute before he became NIH director, and the three of us wrote this 96 00:08:01,090 --> 00:08:06,388 perspective piece not so much talking about the science of the genome project but talking 97 00:08:06,389 --> 00:08:12,409 about sort of how many legacy elements were left because of what the Human Genome Project 98 00:08:12,409 --> 00:08:17,659 did in terms of changing big biology, if you will, so I put you to that -- point you to 99 00:08:17,659 --> 00:08:20,979 that article if you’re interested in reading some of the historical aspects and importantly 100 00:08:20,979 --> 00:08:23,938 the legacy elements of the genome project beyond the base pairs. 101 00:08:23,939 --> 00:08:29,120 And, in fact, speaking about beyond the base pairs, we are celebrating this odometer moment, 102 00:08:29,120 --> 00:08:34,320 this 25th anniversary in another lecture series. 
I’m going to put in a shameless plug because, 103 00:08:34,320 --> 00:08:38,240 in fact, we have an ongoing lecture series that takes place right at this podium every 104 00:08:38,240 --> 00:08:42,549 month, and in fact Thursday of this week, which I believe is tomorrow -- yes, it is 105 00:08:42,549 --> 00:08:48,699 tomorrow, Ewan Birney will be speaking here at this podium at 2:00 because we’re bringing 106 00:08:48,700 --> 00:08:55,240 in a series of individuals who were there, heavily involved in creating and executing 107 00:08:55,240 --> 00:09:01,279 the Human Genome Project, including people like Ewan, who really came into prominence 108 00:09:01,279 --> 00:09:05,310 at a very young age to help with the genome project’s end stage and now really used 109 00:09:05,310 --> 00:09:09,268 that as a launching pad for his remarkable career. So, if you’re free at 2:00 p.m. 110 00:09:09,269 --> 00:09:12,600 tomorrow, please come here. If you happen to miss it, of course we video tape this stuff 111 00:09:12,600 --> 00:09:16,440 and make it all available on our genome TV channel of YouTube. 112 00:09:16,440 --> 00:09:22,870 So, it’s been 25 years since the launch of the genome project, and just a little over 113 00:09:22,870 --> 00:09:27,440 that since the beginning of this field of genomics, so to review things I thought I 114 00:09:27,440 --> 00:09:33,720 would just talk about sort of what I think are the six key accomplishments or highlights, 115 00:09:33,720 --> 00:09:39,480 if you will, of genomics in its first quarter century or so of existence and in reviewing 116 00:09:39,480 --> 00:09:45,399 these six areas I want to also contextualize some of the signature efforts that you’ve 117 00:09:45,399 --> 00:09:49,160 probably heard about, or if you haven’t, you should be aware of, that really have been 118 00:09:49,160 --> 00:09:54,800 incredibly important for moving the field forward. 
A common theme of this will be the 119 00:09:54,800 --> 00:09:58,109 -- some of these efforts are -- most of these efforts are big and they’re audacious and 120 00:09:58,110 --> 00:10:03,199 they’re very much cast in the kind of style that the Human Genome Project was cast in 121 00:10:03,199 --> 00:10:08,010 because it was remarkable what it accomplished, and in fact highlight number one on this six 122 00:10:08,010 --> 00:10:11,569 highlights I’m going to give you is that the human genome was sequenced for the very 123 00:10:11,570 --> 00:10:16,889 first time by the Human Genome Project, and that absolutely is the number one highlight 124 00:10:16,889 --> 00:10:21,990 in many ways of the past 25 years, and it’s a highlight both because it provided such 125 00:10:21,990 --> 00:10:26,350 incredible information about our blueprint, which has launched so many other efforts that 126 00:10:26,350 --> 00:10:30,410 I’m going to talk about, but it’s also important because it launched a whole lot 127 00:10:30,410 --> 00:10:37,790 of other areas, use of genomics as a pivotal tool for advancing those fields. And in fact 128 00:10:37,790 --> 00:10:42,930 this is a subset of those areas, but every one of these areas has been remarkably advanced 129 00:10:42,930 --> 00:10:50,410 and enriched because genomics has found its way to be impactful in these areas. And every 130 00:10:50,410 --> 00:10:53,969 one of these could be a talk in and of themselves, and probably an entire symposium. I’m not 131 00:10:53,970 --> 00:10:59,620 going to talk about any of this because as I -- although Julie Segre will talk about 132 00:10:59,620 --> 00:11:04,199 and she’ll be one of the speakers in this series on May 18, and will in fact point out 133 00:11:04,199 --> 00:11:08,389 an example of how genomics is really in many ways completely changing the face of diagnostics 134 00:11:08,389 --> 00:11:10,870 when it comes to infectious agents. 
135 00:11:10,870 --> 00:11:16,850 My emphasis will be more on human health and human disease and medicine because that’s 136 00:11:16,850 --> 00:11:21,269 the one that in particular is of greatest relevance for us here and NIH in particular 137 00:11:21,269 --> 00:11:26,420 for my institute, the National Human Genome Research Institute, and as Andy mentioned 138 00:11:26,420 --> 00:11:31,519 in the introduction, I’ve been at the institute for about 21 years. I’ve been in the field 139 00:11:31,519 --> 00:11:37,070 since the beginning, but six years ago I became the director, and while I certainly was involved 140 00:11:37,070 --> 00:11:43,110 in thinking about these things when I was in my previous roles at the institute, certainly 141 00:11:43,110 --> 00:11:48,269 when I became director I became increasingly laser focused on thinking about how to facilitate 142 00:11:48,269 --> 00:11:55,130 the application of genomics to health, disease, and medicine, framing it around the concept 143 00:11:55,130 --> 00:11:59,209 of genomic medicine as sort of the ultimate goal, if you will. 144 00:11:59,209 --> 00:12:05,130 By genomic medicine, I mean this as a medical discipline that involves using genomic information 145 00:12:05,130 --> 00:12:09,949 about an individual as part of their clinical care, and of course, important other implications 146 00:12:09,949 --> 00:12:16,880 of that clinical use. This, of course, is largely synonymous with other terms you’ll 147 00:12:16,880 --> 00:12:21,110 hear: individualized medicine, personalized medicine, and later on we’re going to talk 148 00:12:21,110 --> 00:12:27,610 about precision medicine. I would say our framing of this is really very much limited 149 00:12:27,610 --> 00:12:32,300 to genomic information as a subpart of some of these other ways to frame it, but we’re 150 00:12:32,300 --> 00:12:36,529 really going to stay focused on the genomic information as a means of individualizing 151 00:12:36,529 --> 00:12:37,240 care. 
152 00:12:37,240 --> 00:12:41,380 So, in thinking about what we want to do as a field, certainly what my institute wants 153 00:12:41,380 --> 00:12:48,920 to do as a research funder, we really think about this as a progression where we need 154 00:12:48,920 --> 00:12:54,969 to traverse a series of accomplishments to eventually see genomic medicine become a reality. 155 00:12:54,970 --> 00:12:58,720 We are grounded heavily in the starting line of the Human Genome Project. That’s what 156 00:12:58,720 --> 00:13:03,230 we really think started all this, which in some ways means the starting line was 13 years 157 00:13:03,230 --> 00:13:08,100 ago because we think once we had a sequence of the human genome, that really set up the 158 00:13:08,100 --> 00:13:12,180 circumstance for accomplishments that I’m about to tell you about, and eventually we’re 159 00:13:12,180 --> 00:13:17,388 going to realize genomic medicine, and we’re going to see the practice of medicine changed 160 00:13:17,389 --> 00:13:20,709 because of the use of genomic information, but this is not going to happen overnight, 161 00:13:20,709 --> 00:13:24,910 and it’s not a sort of a simple, one kind of project effort. It involves many steps. 162 00:13:24,910 --> 00:13:28,469 It’s going to involve many countries. It’s going to involve thousands of scientists. 163 00:13:28,470 --> 00:13:32,630 It’s going to require an amazing amount of creativity, and we can’t even anticipate 164 00:13:32,630 --> 00:13:35,680 all the things we’re going to need, although we can anticipate some of them, and I think 165 00:13:35,680 --> 00:13:38,089 we’ve done a lot in the last 13 years. 
166 00:13:38,089 --> 00:13:43,259 I also want to emphasize this is going to require a community, a highly interdisciplinary 167 00:13:43,259 --> 00:13:50,540 community of scientists, health-care professionals, and people in all -- all people that are touching 168 00:13:50,540 --> 00:13:55,269 health care in some form and science and research, and they’re going to all have to be very, 169 00:13:55,269 --> 00:13:59,519 very highly collegial, and we’re going to have to be doing this together for a long 170 00:13:59,519 --> 00:14:01,940 time. The analogy I’ve been using is one of a marathon. I mean, really, we’re going 171 00:14:01,940 --> 00:14:05,550 to have a lot of people running shoulder to shoulder. It is not a sprint, and we have 172 00:14:05,550 --> 00:14:08,859 to be in this for the long haul because there’s a lot of complexities, some of which I’ll be 173 00:14:08,860 --> 00:14:13,850 unpacking throughout my talk. But that’s a tall order. I mean, you know, sort of thinking 174 00:14:13,850 --> 00:14:18,259 about how do you go from sort of the base pairs provided by the genome project to actually 175 00:14:18,259 --> 00:14:22,139 change how we take care of patients at the bedside or maybe if you prefer the metaphor 176 00:14:22,139 --> 00:14:26,550 from double helix to human health, you know, that’s going to require some pretty clear 177 00:14:26,550 --> 00:14:31,349 and important strategic thinking around this. And there’s one thing the genomics community 178 00:14:31,350 --> 00:14:35,690 I think is really good at is I think we’re really good at being strategic, and we’re 179 00:14:35,690 --> 00:14:39,230 also very good at sort of organizing how we want to pursue things, and it was sort of 180 00:14:39,230 --> 00:14:43,279 in the fabric of what we did during the genome project. 
In fact, the way we accomplished 181 00:14:43,279 --> 00:14:49,290 the genome project was to every couple of years develop a new strategic plan that would 182 00:14:49,290 --> 00:14:53,529 guide the next few years and be willing to, by the way, rip up a strategic plan when it 183 00:14:53,529 --> 00:14:56,130 seemed outdated after a few years and come up with a new one. 184 00:14:56,130 --> 00:15:01,230 So, since that was sort of -- sort of culturally what we did to sort of map out the next set 185 00:15:01,230 --> 00:15:04,269 of things that needed to be accomplished, it probably wasn’t surprising that literally 186 00:15:04,269 --> 00:15:11,170 the day the genome project ended 13 -- nearly 13 years ago, our institute published following 187 00:15:11,170 --> 00:15:15,469 an incredible amount of consultation with the community a strategic plan for the future 188 00:15:15,470 --> 00:15:20,610 of genomics immediately starting after the genome project was completed, and it served 189 00:15:20,610 --> 00:15:25,199 us well, I will tell you, for a number of years, but it -- but like many things, when 190 00:15:25,199 --> 00:15:29,310 there’s scientific advances it doesn’t serve you well forever because new opportunities 191 00:15:29,310 --> 00:15:37,768 come up, and in fact we saw those new opportunities by around the end of 2009, 2010 in particular, 192 00:15:37,769 --> 00:15:41,500 and we recognized it was time for a new strategic vision or updated strategic vision, in which 193 00:15:41,500 --> 00:15:46,629 we put out, once again co-authored by members of the institute, again involving a lot of 194 00:15:46,629 --> 00:15:49,019 strategic consultation with the community. 
195 00:15:49,019 --> 00:15:52,920 The big difference for the first time with our 2011 strategic plan, which is the one 196 00:15:52,920 --> 00:15:59,040 we still use, was the incorporation of genomic medicine as a key element, as a key goal, 197 00:15:59,040 --> 00:16:04,589 as I’ve articulated to you, in fact putting it in the title of that strategic plan. So, 198 00:16:04,589 --> 00:16:10,060 for those of you who have not read this, it is, you know, five years old, and we’ve 199 00:16:10,060 --> 00:16:14,069 really looked critically at it very recently and actually still feel it has a very, very, 200 00:16:14,069 --> 00:16:18,810 very good shelf life in terms of just still being very fresh and robust. I guess I’m 201 00:16:18,810 --> 00:16:23,479 not allowed to give out required reading for this class, but I will tell you that a lot 202 00:16:23,480 --> 00:16:27,160 of things that we say in the strategic plan will be on the test, so for those of you who 203 00:16:27,160 --> 00:16:30,269 really want to know what’s on the test at the end of this -- what, three people just 204 00:16:30,269 --> 00:16:34,029 left because they thought there really is a test. They just dropped the class. No, I 205 00:16:34,029 --> 00:16:37,300 can’t make mandatory reading, but I would strongly encourage you, if you’re here to 206 00:16:37,300 --> 00:16:41,628 learn about genomics, this would be a great article to read. Even though it’s 2011, 207 00:16:41,629 --> 00:16:45,360 it’s still quite relevant to everything I’m talking about and many things I don’t 208 00:16:45,360 --> 00:16:49,120 have time to talk about. If you want to quickly download it, you can get to the PDF immediately 209 00:16:49,120 --> 00:16:52,620 by going to the URL, but don’t read it while I’m talking because I have a lot of things 210 00:16:52,620 --> 00:16:54,600 I want to cover. You can read it afterwards. 
211 00:16:54,600 --> 00:17:00,600 But let me just give you a general overview of what the strategic plan describes as a 212 00:17:00,600 --> 00:17:08,039 -- as a framework because in fact the framework that we put forth in this serves as an organizing 213 00:17:08,039 --> 00:17:11,770 principle in many ways for almost everything we’re doing at our institute, and I think 214 00:17:11,770 --> 00:17:16,879 in many ways is a nice framing of many of the things going on here in the entire field 215 00:17:16,880 --> 00:17:23,339 of genomics because what we heard during our strategic planning that led up to the 2011 216 00:17:23,339 --> 00:17:28,740 publication was that it was finally time for the genomics community to be more specific 217 00:17:28,740 --> 00:17:33,710 and more sophisticated in describing how they were going to actually go from basic genomic 218 00:17:33,710 --> 00:17:37,320 information to actually changing the practice of medicine. It was always a thing we would 219 00:17:37,320 --> 00:17:41,689 say during the genome project, you know, “One day, this will be really important for how 220 00:17:41,690 --> 00:17:46,350 we practice medicine,” but now, 2011, it was time to actually start to describe a research 221 00:17:46,350 --> 00:17:50,600 agenda that would inch you closer and closer to actually changing the practice of medicine. 222 00:17:50,600 --> 00:17:56,350 It was important to organize the thinking and programmatically important to know how 223 00:17:56,350 --> 00:18:00,850 you were going to develop research programs that helped with this progression from left 224 00:18:00,850 --> 00:18:04,230 to right, so at the end of the day, we found that we could describe everything we needed 225 00:18:04,230 --> 00:18:09,190 to do or most things we needed to do in five major bins of activities or domains, as we 226 00:18:09,190 --> 00:18:09,890 called them. 227 00:18:09,890 --> 00:18:13,390 Let me introduce you to those domains. 
I mean, one of the -- the first one was saying we 228 00:18:13,390 --> 00:18:17,320 were very familiar with understanding the structure of genomes, largely what we had 229 00:18:17,320 --> 00:18:22,250 done during the genome project and the immediate period beyond, but also recognizing that we 230 00:18:22,250 --> 00:18:27,210 also needed to understand the biology of genomes, how those Gs, As, Ts, and Cs did all of their 231 00:18:27,210 --> 00:18:32,860 work, and that was an important thing, ongoing, but was a domain of research activities, and 232 00:18:32,860 --> 00:18:37,199 with knowledge then about how the genome works, it provides you opportunities to use genomics 233 00:18:37,200 --> 00:18:41,539 to then understand the biology of disease. How is it that changes in our genomes influence 234 00:18:41,539 --> 00:18:48,039 our health and well-being? And having a clear domain focused around human disease was very 235 00:18:48,039 --> 00:18:52,669 important. Obviously, if you start to get insights about human disease, it provides 236 00:18:52,669 --> 00:18:56,909 you the opportunity to think about how to advance medical science, and that would involve 237 00:18:56,909 --> 00:19:02,030 clinical research that would eventually give you an ability to think about how to use genomics 238 00:19:02,030 --> 00:19:07,080 to maybe have a more sophisticated approach to medicine, but just because you have better 239 00:19:07,080 --> 00:19:11,230 ways of practicing medicine doesn’t necessarily mean you’ve proven that you’re going to 240 00:19:11,230 --> 00:19:17,299 change how well health care works, so we also sort of put down as a domain of responsibility 241 00:19:17,299 --> 00:19:21,179 in many ways for doing research to eventually demonstrate that you can actually improve 242 00:19:21,179 --> 00:19:24,990 the effectiveness of how you’re caring for your patients using genomic 243 00:19:24,990 --> 00:19:25,750 approaches. 
244 00:19:25,750 --> 00:19:30,909 So, these five domains really do represent what we think about, and I think as I will 245 00:19:30,909 --> 00:19:36,770 now continue to describe my highlights of the last quarter century, you will see how 246 00:19:36,770 --> 00:19:42,110 we are moving from left to right on that progression through these series of domains, eventually 247 00:19:42,110 --> 00:19:47,340 finding ourselves thinking about medical science and hopefully eventually improving the effectiveness 248 00:19:47,340 --> 00:19:52,309 of health care. So, with that as a backdrop, let me continue with my highlights of the 249 00:19:52,309 --> 00:19:58,490 last quarter century of genomics. Well, we sequenced the human genome for the first time 250 00:19:58,490 --> 00:20:03,029 in the Human Genome Project, but, while that was incredibly satisfying, we were thinking 251 00:20:03,029 --> 00:20:09,070 about one day sequencing patients’ genomes, and to do that we needed to make sure that 252 00:20:09,070 --> 00:20:13,629 we could cost-effectively and highly accurately sequence people’s genomes, not just once 253 00:20:13,630 --> 00:20:20,059 but many, many, many times. Well, to accomplish that, we needed to very much reduce the cost 254 00:20:20,059 --> 00:20:25,070 of sequencing. The good news is, we’ve done it. In fact, the cost for sequencing the human 255 00:20:25,070 --> 00:20:31,389 genome has been reduced nearly a million fold over -- since the Human Genome Project’s 256 00:20:31,390 --> 00:20:34,070 first sequencing of the human genome. 257 00:20:34,070 --> 00:20:42,158 Now, that didn’t happen by accident, and in fact our institute deserves, I think, some 258 00:20:42,159 --> 00:20:47,130 credit, although not exclusive credit, but some credit of recognizing that this was pivotally 259 00:20:47,130 --> 00:20:51,299 important for the -- what needed to happen once the genome project was completed. 
And 260 00:20:51,299 --> 00:20:57,529 in fact in the strategic plan that we wrote and published the day the genome project ended, 261 00:20:57,529 --> 00:21:00,529 we said a lot of things in that strategic plan, but one of the things of relevance here 262 00:21:00,529 --> 00:21:05,970 is we talked about “…technological leaps that seemed so far off as to be almost fictional 263 00:21:05,970 --> 00:21:10,529 but which, if they could be achieved, would revolutionize biomedical research and clinical 264 00:21:10,529 --> 00:21:15,169 practice.” And we gave as an example -- we gave several examples, but the key example 265 00:21:15,169 --> 00:21:19,210 relevant here was -- an example was, “…the ability to sequence DNA at costs that are 266 00:21:19,210 --> 00:21:23,770 lower by four to five orders of magnitude than the current cost, allowing a human genome to be sequenced 267 00:21:23,770 --> 00:21:25,260 for $1000 or less.” 268 00:21:25,260 --> 00:21:32,039 So, here it was. The genome project ended, and we put our names on a Nature paper that 269 00:21:32,039 --> 00:21:36,000 said, “We need to now go out and figure out how to sequence a human genome for $1000.” 270 00:21:36,000 --> 00:21:42,450 It was a rather audacious claim, considering that day marked the final day where we had 271 00:21:42,450 --> 00:21:47,130 finished sequencing the first human genome, but when we added up all the costs associated 272 00:21:47,130 --> 00:21:51,210 with sequencing that first human genome, it came in at something like $1 billion, and 273 00:21:51,210 --> 00:21:55,159 people will argue about whether it was $600 million, $700 million, $800 -- I just round 274 00:21:55,159 --> 00:21:59,090 it up, roughly $1 billion, and here we were on that day proposing, “Oh, we just need 275 00:21:59,090 --> 00:22:02,959 to knock six zeroes off of that figure and eventually come up with a thousand-dollar 276 00:22:02,960 --> 00:22:08,929 genome,” and while it was audacious, it was catalytic. 
Now, meanwhile, it was catalytic 277 00:22:08,929 --> 00:22:13,320 because we decided as an institute to put out a major granting program, and that major 278 00:22:13,320 --> 00:22:19,918 granting program aimed to collect great ideas from creative scientists around the world 279 00:22:19,919 --> 00:22:24,679 actually. Their goal was basically to get rid of this, because this was the factories 280 00:22:24,679 --> 00:22:28,490 that were used for sequencing that first human genome, which was one of multiple factories 281 00:22:28,490 --> 00:22:32,250 that were used for sequencing the first genome, and we wanted creative people to come up with 282 00:22:32,250 --> 00:22:38,600 some fancy-schmancy way to sequence DNA, shown here in iconic form, as something magical 283 00:22:38,600 --> 00:22:45,570 and revolutionary that would knock six zeroes off of that figure, and the good news is not 284 00:22:45,570 --> 00:22:49,520 only did we get creative scientists to come in, and we gave grants to, and they did remarkable 285 00:22:49,520 --> 00:22:54,240 high-risk things, many of which paid off. The good news was that the private sector 286 00:22:54,240 --> 00:22:58,260 met us as partners, and the private sectors recognized this was what was important, too, 287 00:22:58,260 --> 00:23:02,158 and there was a considerable amount of private sector investment as well on in many cases 288 00:23:02,159 --> 00:23:06,679 commercializing things that came out of our scientists, our funded scientists’ efforts. 289 00:23:06,679 --> 00:23:10,809 And the rest is history. I mean, it’s 13 years, but the rest is history. 
This has been 290 00:23:10,809 --> 00:23:15,360 chronicled in Nature articles talking about our -- this program and these efforts, and 291 00:23:15,360 --> 00:23:19,010 the graph on the left is sort of an iconic graph that we put out, and we have a whole 292 00:23:19,010 --> 00:23:24,480 webpage that catalogues and has been cataloguing the cost of sequencing by -- especially by 293 00:23:24,480 --> 00:23:29,809 the centers that we support for big, large-scale sequencing, and in green is the cost of sequencing 294 00:23:29,809 --> 00:23:33,629 on a logarithmic scale, the cost of sequencing a human genome. It’s just fallen precipitously, 295 00:23:33,630 --> 00:23:40,110 and it’s because of fancy, wonderful new technologies such as those shown on the right, 296 00:23:40,110 --> 00:23:45,229 and it’s not just the fact that we are getting really close to a thousand-dollar genome because 297 00:23:45,230 --> 00:23:48,570 that’s one thing we are, and that’s where it’s almost a million-fold reduction. 298 00:23:48,570 --> 00:23:52,770 It’s that it’s also how quickly we can sequence genomes, so to give you a perspective, 299 00:23:52,770 --> 00:23:55,820 you know, that first human genome sequenced as part of the genome project cost something 300 00:23:55,820 --> 00:24:01,149 like a billion dollars, but it also took six to eight years of active sequencing. That’s 301 00:24:01,149 --> 00:24:08,070 a long time to get a sequence, but today, using new methods, only a few thousand dollars. 302 00:24:08,070 --> 00:24:12,549 We’re getting close to a thousand, but you can also do this in about a day or two, and 303 00:24:12,549 --> 00:24:17,039 in fact many believe using new protocols we will get this down to a day or less than 304 00:24:17,039 --> 00:24:22,190 a day in the coming year.
And the other thing that’s really exciting about this and these 305 00:24:22,190 --> 00:24:26,510 technologies is that whatever you think we use today, it probably won’t be what we’re 306 00:24:26,510 --> 00:24:30,240 using two or three years from now. It is very much like sitting in an airport where you 307 00:24:30,240 --> 00:24:33,710 know you have a lot of nice planes on the ground, but you look over on the horizon, 308 00:24:33,710 --> 00:24:37,649 and there’s more planes coming, and there’s another and then there’s another one. I 309 00:24:37,649 --> 00:24:41,370 happen to know there’s some really cool technologies that are sort of early stage, 310 00:24:41,370 --> 00:24:44,489 aren’t ready to be commercialized, but, you know, they’re about the second or third 311 00:24:44,490 --> 00:24:48,669 plane in, and probably within about a year or two they will be supplanting, I think, 312 00:24:48,669 --> 00:24:50,549 some other -- the technologies that we currently use today. 313 00:24:50,549 --> 00:24:55,260 So it’s a very, very exciting time, and it’s not letting up, and in fact there’s 314 00:24:55,260 --> 00:24:59,658 a lot of excitement over nanopores. That’s the latest rage, and devices such as that 315 00:24:59,659 --> 00:25:04,580 shown here on the bottom right that literally plug into the USB port of your laptop and 316 00:25:04,580 --> 00:25:08,908 can sequence DNA and maybe be able to sequence a human genome within a day if all things 317 00:25:08,909 --> 00:25:15,039 work out.
Remarkably exciting developments, and we just sort of stand back and watch this 318 00:25:15,039 --> 00:25:20,309 happen, and so there’s a lot to be described in these new technologies, and in fact one 319 00:25:20,309 --> 00:25:24,760 of the most popular lectures in the series of late has been Elaine Mardis, who’s kind 320 00:25:24,760 --> 00:25:29,100 enough to fly here and give a lecture, a real leader in sequencing technology, and she’ll 321 00:25:29,100 --> 00:25:33,250 be here May 25, and it’s not a lecture you want to miss because I think it always gets 322 00:25:33,250 --> 00:25:37,220 the highest YouTube hits if I’m correct. And she will -- she will describe what I just 323 00:25:37,220 --> 00:25:40,840 gave in three slides and she’ll talk about it for over an hour, and there’s a lot to 324 00:25:40,840 --> 00:25:43,949 talk about in DNA sequencing technologies. 325 00:25:43,950 --> 00:25:49,929 I do want to point out because some of this talk is a little philosophical how important 326 00:25:49,929 --> 00:25:53,799 these technological advances have been for the field of genomics, you know, and in the 327 00:25:53,799 --> 00:25:59,020 history of science, you’ve often seen major inflections in scientific progress 328 00:25:59,020 --> 00:26:03,010 because of technological advances. I’ll give you a few. You know, needless to say 329 00:26:03,010 --> 00:26:07,260 the telescope sort of changed the face of astronomy. The microscope changed the face 330 00:26:07,260 --> 00:26:13,429 of cell biology, and certainly devices such as those shown on the left, various radiographic 331 00:26:13,429 --> 00:26:18,850 methods really changed the face of radiology as we know it, and trust me, that’s exactly 332 00:26:18,850 --> 00:26:23,360 what is happening with these new instruments. Technologies for sequencing DNA in the last 333 00:26:23,360 --> 00:26:27,309 13 years have completely changed the face of genomics.
In fact, I think it’s changing 334 00:26:27,309 --> 00:26:32,559 the face of biomedical research as genomics sort of permeates across the entire enterprise 335 00:26:32,559 --> 00:26:33,740 of biomedicine. 336 00:26:33,740 --> 00:26:39,070 So, that’s a great accomplishment. Elaine will tell you more. Well, that’s great. 337 00:26:39,070 --> 00:26:45,529 Now we can sequence genomes quite inexpensively, and that we particularly want to do because 338 00:26:45,529 --> 00:26:50,409 we want to sequence many people’s genomes, and now we can afford to do it, and it’s 339 00:26:50,409 --> 00:26:54,200 -- and the reason we want to do it is we’re not just interested in that first reference 340 00:26:54,200 --> 00:26:57,940 sequence. That just sort of gave us a hypothetical individual. It wasn’t even one person. It 341 00:26:57,940 --> 00:27:04,070 was a patchwork of people. It was a reference. We in fact want to sequence hundreds, thousands, 342 00:27:04,070 --> 00:27:07,710 tens of thousands, eventually hundreds of thousands of people because we want to figure 343 00:27:07,710 --> 00:27:14,090 out how all of us are different. The good news is that even so far we’ve already sequenced 344 00:27:14,090 --> 00:27:17,529 now tens of thousands, and I probably have to change this slide fairly soon. That will 345 00:27:17,529 --> 00:27:21,029 have to say hundreds of thousands of human genomes have actually been sequenced in one 346 00:27:21,029 --> 00:27:27,179 form or the other worldwide, and that is providing us a very remarkable opportunity to understand 347 00:27:27,179 --> 00:27:29,330 how we all have different blueprints. 348 00:27:29,330 --> 00:27:34,820 So, let me just remind you any one -- any two of us differ about every one out of a 349 00:27:34,820 --> 00:27:41,799 thousand bases as you go across all the letters in your genome.
Those differences are variants, 350 00:27:41,799 --> 00:27:45,299 at least depicted here as single nucleotide variants, a G where other people might have 351 00:27:45,299 --> 00:27:49,820 a C or an A or somebody else might have a T, and so forth, and so we have millions of 352 00:27:49,820 --> 00:27:54,629 those variants compared to any reference or compared to the person sitting next to you, 353 00:27:54,630 --> 00:27:59,720 but the great majority of those variants are inconsequential from a biological, a medical 354 00:27:59,720 --> 00:28:05,059 perspective, but a subset are very, very consequential, and we want to know those, and by the way, 355 00:28:05,059 --> 00:28:08,789 it’s not that you all have your own private set of millions of variants and nobody -- most 356 00:28:08,789 --> 00:28:12,408 of the variants you have are very common, and other people have -- probably other people 357 00:28:12,409 --> 00:28:19,760 in this room have, and that makes it a situation where we could imagine if they’re very -- relatively 358 00:28:19,760 --> 00:28:23,140 common, if we just sequenced enough people, we could develop catalogs of those variants, 359 00:28:23,140 --> 00:28:26,630 and then if we had catalogs of those variants and we had really good methods, we could probably 360 00:28:26,630 --> 00:28:30,559 start figuring out which ones are consequential and which ones are not and which ones might 361 00:28:30,559 --> 00:28:35,158 be not so good variants, might confer a risk for disease and which ones might be good variants 362 00:28:35,159 --> 00:28:39,390 because they maybe protect you from a disease. They may be associated with some other positive 363 00:28:39,390 --> 00:28:40,559 attribute. 364 00:28:40,559 --> 00:28:45,940 And so, as a result, having sequenced that first human genome, there was remarkable motivation 365 00:28:45,940 --> 00:28:52,480 to start cataloging common variants in the human population.
You might have heard about 366 00:28:52,480 --> 00:29:00,019 the desire to get single nucleotide polymorphisms and indeed -- or SNPs -- and the first effort 367 00:29:00,019 --> 00:29:03,460 was something called the SNP Consortium, but that quickly, once it started to get some 368 00:29:03,460 --> 00:29:07,909 traction, gave rise to something called the International HapMap Project, which attempted 369 00:29:07,909 --> 00:29:13,700 to not only catalog these SNPs, these single nucleotide variants, but also to help us figure 370 00:29:13,700 --> 00:29:18,360 out how they relate to one another on blocks of DNA called haplotypes, on human chromosomes, 371 00:29:18,360 --> 00:29:22,490 because it turns out that not all these variants just sort of go all different directions as 372 00:29:22,490 --> 00:29:25,940 they’re inherited, but rather they have neighborhoods of variants that have many variants 373 00:29:25,940 --> 00:29:30,470 in a big block of DNA on a chromosome -- tend to stick together as they get inherited from 374 00:29:30,470 --> 00:29:35,409 one generation to the next. And so through a series of publications, the last one being 375 00:29:35,409 --> 00:29:43,190 in 2010, a lot of information about SNPs, single nucleotide variants and their haplotype 376 00:29:43,190 --> 00:29:47,080 structure, was elucidated and shared with the biomedical research community. 
377 00:29:47,080 --> 00:29:53,549 At right around 2010 or even a couple years before then, new methods for sequencing DNA 378 00:29:53,549 --> 00:29:58,450 came about that allowed us to really accelerate the pace of discovering variants in the human 379 00:29:58,450 --> 00:30:04,390 population and that gave rise to the signature project, the 1,000 Genomes Project, which 380 00:30:04,390 --> 00:30:08,580 was another audacious, large, international effort like the HapMap Project and like the 381 00:30:08,580 --> 00:30:13,250 Human Genome Project, to catalog the most common variants across the world actually 382 00:30:13,250 --> 00:30:19,899 and you can see from a collection of -- by the way, a lot of times in genomics we overachieve 383 00:30:19,899 --> 00:30:23,750 and so we originally named the project 1,000 Genomes and quickly overachieved. So over 384 00:30:23,750 --> 00:30:29,230 2,500 genomes were sequenced in the end, collected from 26 populations in the world, initially 385 00:30:29,230 --> 00:30:36,659 described in this “marker paper,” as they’re called, in 2010 and then finally culminating 386 00:30:36,659 --> 00:30:41,909 in this remarkable paper coming out last October that’s sort of the final paper -- final 387 00:30:41,909 --> 00:30:47,460 major paper of the 1,000 Genomes Project, and what’s remarkable about their effort 388 00:30:47,460 --> 00:30:51,240 is that where once upon a time when the Genome Project ended and we had information about 389 00:30:51,240 --> 00:30:55,690 maybe thousands, tens of thousands maybe of variants that we knew existed in the human 390 00:30:55,690 --> 00:31:00,990 population at specific points in the genome, 1,000 Genomes had sort of gotten us up to 391 00:31:00,990 --> 00:31:02,450 a much higher threshold. 
392 00:31:02,450 --> 00:31:06,649 In fact, they got us to the point where there are about 90 million places in the human genome 393 00:31:06,649 --> 00:31:11,850 we now know are variant across the human population and we know the variants that sit at those 394 00:31:11,850 --> 00:31:17,139 particular sites. So we went from tens of thousands to nearly 90 million variants of 395 00:31:17,139 --> 00:31:22,139 sites that we know exist and that gave rich, rich, rich catalogs of information that could 396 00:31:22,139 --> 00:31:26,209 then be used for scientists to test which of those variants are important, and so Lynn 397 00:31:26,210 --> 00:31:30,960 Jorde will once again take the last three slides I gave and unpack them in much greater 398 00:31:30,960 --> 00:31:35,639 detail when he is here on April 20 talking about population genomics and some of these 399 00:31:35,639 --> 00:31:40,379 efforts that I just quickly described to you. Now the other interesting thing about the 400 00:31:40,380 --> 00:31:46,110 ability to sequence tens of thousands of human genomes is it begins to give us insight about 401 00:31:46,110 --> 00:31:50,860 what any one of our genomes looks like, because one of the things we’re always curious about 402 00:31:50,860 --> 00:31:54,740 is when we eventually get to this point of using genomics to take care of patients, we’re 403 00:31:54,740 --> 00:31:58,750 going to want to know what is a typical patient’s genome, what’s it like, and what can we 404 00:31:58,750 --> 00:32:01,659 learn from it. So we’re starting to learn this and a lot more, but I just thought it’s 405 00:32:01,659 --> 00:32:06,350 fun numbers to put in the back of your head: what does your genome look like by the numbers 406 00:32:06,350 --> 00:32:08,990 and if you ever get your genome sequenced you’ll want to know some of these numbers.
407 00:32:08,990 --> 00:32:14,679 I mean for example, you have six billion nucleotides roughly in your genome, right, because there’s 408 00:32:14,679 --> 00:32:18,950 three billion in the reference sequence, three billion nucleotides, but you have two genomes, 409 00:32:18,950 --> 00:32:22,860 right? You got one from mom, one from dad, so when we sequence a person’s genome we’re 410 00:32:22,860 --> 00:32:28,209 actually sequencing six billion nucleotides or getting information on six billion nucleotides. 411 00:32:28,210 --> 00:32:32,399 But a typical person -- on average when you sequence their six billion nucleotides 412 00:32:32,399 --> 00:32:36,840 -- they have about three to five million single nucleotide variants and if you do the arithmetic, 413 00:32:36,840 --> 00:32:40,789 that’s about what we expected. So compared to the person sitting next to you, there’s 414 00:32:40,789 --> 00:32:45,809 about three to five million differences between your two genome sequences and as I told you 415 00:32:45,809 --> 00:32:50,080 earlier, most of the variants you have, in fact the great, great, great majority of them 416 00:32:50,080 --> 00:32:54,010 are common. We already know about them, they’re in the databases, you could open up a browser 417 00:32:54,010 --> 00:32:59,110 and go to a database resource and you will find that variant, but that’s not all of them, 418 00:32:59,110 --> 00:33:03,459 because about 150,000 of those variants, a minority of your three to five million, are 419 00:33:03,460 --> 00:33:04,929 not in databases yet. 420 00:33:04,929 --> 00:33:08,929 So every single time we still sequence a new human genome, we come up with new variants.
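[Editor's illustration] The "if you do the arithmetic" step above can be sketched in Python. The inputs are the talk's rounded figures (a three-billion-base reference, two copies per person, one difference per thousand bases, roughly 150,000 variants not yet in databases). The function names are hypothetical, and treating the one-in-a-thousand rate as uniform across the genome is a simplifying assumption rather than something the speaker states.

```python
# Rounded figures quoted in the talk; all names are illustrative only.
REFERENCE_BASES = 3_000_000_000    # one copy of the genome
COPIES_PER_PERSON = 2              # one from each parent
ONE_DIFFERENCE_PER = 1_000         # "one out of a thousand bases"
NOVEL_VARIANTS = 150_000           # variants not yet in databases


def expected_differences(bases: int, one_in: int = ONE_DIFFERENCE_PER) -> int:
    """Expected differing bases, assuming a uniform one-in-`one_in` rate."""
    return bases // one_in


def novel_fraction(novel: int, total: int) -> float:
    """Share of a person's variants that are new to the databases."""
    return novel / total


if __name__ == "__main__":
    per_copy = expected_differences(REFERENCE_BASES)
    per_person = expected_differences(REFERENCE_BASES * COPIES_PER_PERSON)
    print(f"Expected differences per genome copy: {per_copy:,}")
    print(f"Expected differences across both copies: {per_person:,}")
    # The talk quotes three to five million variants per person.
    for total in (3_000_000, 5_000_000):
        share = novel_fraction(NOVEL_VARIANTS, total)
        print(f"Novel share if {total:,} variants: {share:.1%}")
```

The simple rate gives a few million differences, the same order of magnitude as the three to five million quoted, and the 150,000 new variants work out to only a few percent of the total.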
421 00:33:08,929 --> 00:33:12,289 Those are the very rare variants, but they’re still worth having and we keep collecting 422 00:33:12,289 --> 00:33:16,330 them and actually what’s very interesting is that when we sequence a given person’s 423 00:33:16,330 --> 00:33:21,418 genome, on average we’ll find about 60 such variants that did not exist in either parent 424 00:33:21,419 --> 00:33:25,940 and of course this is how new variants get created and in the process of creating the 425 00:33:25,940 --> 00:33:29,649 two germ cells that gave rise to you, all that DNA had to be replicated and while there’s 426 00:33:29,649 --> 00:33:34,018 a lot of DNA repair going on, there are some oops’ that happen and each of you is associated 427 00:33:34,019 --> 00:33:39,000 with about 60 oops’. Most of those, I’m sure the great majority, are completely inconsequential, 428 00:33:39,000 --> 00:33:44,649 but occasionally this is how you end up with a genetic disorder in a child that -- where 429 00:33:44,649 --> 00:33:48,090 it didn’t come -- wasn’t inherited from either parent, because it came about brand 430 00:33:48,090 --> 00:33:51,850 new and that would be an example of that, but again the majority of these differences 431 00:33:51,850 --> 00:33:57,769 are completely inconsequential to your health and wellbeing. So that’s just a little aside 432 00:33:57,769 --> 00:34:01,639 -- and a lot more being learned about how many of your repertoire of genes -- how many 433 00:34:01,639 --> 00:34:04,949 times your genes mutated and broke and so forth and we’re learning a lot about that 434 00:34:04,950 --> 00:34:06,919 for the average patient. 
435 00:34:06,919 --> 00:34:12,000 Well, having had sequenced the first human genome, developing ways to sequence genomes 436 00:34:12,000 --> 00:34:15,460 cheaper and cheaper and then going out and actually sequencing many, many genomes and 437 00:34:15,460 --> 00:34:21,610 getting lots of knowledge about variation in the genome, it immediately started to -- already 438 00:34:21,610 --> 00:34:26,180 was happening in parallel -- wanting to know, okay, well when you have a sequence difference, 439 00:34:26,179 --> 00:34:32,100 what does it do? How does it influence the viability or any -- in the development -- any 440 00:34:32,100 --> 00:34:37,810 aspect of a creature, in this case of a human? So to do that we really needed to understand 441 00:34:37,810 --> 00:34:43,820 how it is that the human genome actually functions and I would tell you 13 years having -- following 442 00:34:43,820 --> 00:34:47,440 the end of the Genome Project, there have been profound advances in understanding how 443 00:34:47,440 --> 00:34:53,730 the human genome actually functions. And let me just remind you that that was not what 444 00:34:53,730 --> 00:34:56,590 the Genome Project was supposed to do. This is what the Genome Project was supposed to 445 00:34:56,590 --> 00:35:01,190 do and they did it, they basically read out all these letters -- of course this is only 446 00:35:01,190 --> 00:35:05,710 a subset of what the Genome -- this is only .00001 percent of what the Genome Project 447 00:35:05,710 --> 00:35:12,880 did and it’s a complicated language, that is very hard to sort of immediately grasp 448 00:35:12,880 --> 00:35:18,400 where is the important parts, and I will tell you that when the Genome Project ended 13 449 00:35:18,400 --> 00:35:23,330 years ago, our tools for actually interpreting the three billion letters were really -- really 450 00:35:23,330 --> 00:35:25,890 weren’t that great, they were quite nascent. 
451 00:35:25,890 --> 00:35:33,129 They were not bad for genes but for understanding non-gene -- the functional parts of the genome 452 00:35:33,130 --> 00:35:36,890 that are not genes, were actually quite weak and we had a lot of work to do, but we did 453 00:35:36,890 --> 00:35:43,200 know a thing or two about genes and in fact we knew that genes had introns and exons and 454 00:35:43,200 --> 00:35:48,410 we knew that DNA got made into RNA and that that RNA could be alternately spliced and 455 00:35:48,410 --> 00:35:52,720 you can get different gene products as a result of that and we were armed with that genetic 456 00:35:52,720 --> 00:35:56,810 code I told you about earlier, so at least when it came to the protein-coding genes we 457 00:35:56,810 --> 00:36:00,400 could look up and figure out how they made proteins. So we were pretty good shape for 458 00:36:00,400 --> 00:36:05,440 genes and so we went to it -- and when I say “we” I mean the community went to it -- and 459 00:36:05,440 --> 00:36:08,700 went through and quickly highlighted all the parts of the human genome that looked like 460 00:36:08,700 --> 00:36:13,029 they were genes, acted like genes, therefore probably were genes for the most part and 461 00:36:13,030 --> 00:36:16,320 at the end of the day that only accounted for about one and a half percent of the letters 462 00:36:16,320 --> 00:36:20,910 of the human genome and by -- although we still work on the exact number, it’s about 463 00:36:20,910 --> 00:36:24,830 20,000 genes; much lower than we anticipated, but that’s the number. 
464 00:36:24,830 --> 00:36:30,220 What was interesting was that we knew there was a lot more functional stuff in there and 465 00:36:30,220 --> 00:36:34,470 that we were surprised by how little of the human genome actually coded for genes, only 466 00:36:34,470 --> 00:36:40,000 one and a half percent, and we knew there was amazing amount of other choreography that 467 00:36:40,000 --> 00:36:43,910 had to be at play to figure out where, when, and how much genes were going to get turned 468 00:36:43,910 --> 00:36:47,890 on and helping chromosomes function and all sorts of things and we knew we had to find 469 00:36:47,890 --> 00:36:53,940 that stuff, all that functional stuff outside of genes, and it was interesting because at 470 00:36:53,940 --> 00:36:58,870 the end of the day, you know, we would have felt we’d have really brilliant people on 471 00:36:58,870 --> 00:37:03,120 the planet, brilliant scientists to help us -- but at the end of the day we actually looked 472 00:37:03,120 --> 00:37:07,080 back in time to help be guided how we were going to figure out all the functional sequence 473 00:37:07,080 --> 00:37:10,460 of the human genome and probably one of the most inspirational figures that influenced 474 00:37:10,460 --> 00:37:13,860 us were not any of the people listed here, but actually someone that was listed even 475 00:37:13,860 --> 00:37:18,590 before here and that was Darwin, and it was sort of interesting how Darwin really came 476 00:37:18,590 --> 00:37:23,360 to be commonly discussed immediately when the Genome Project ended, because there were 477 00:37:23,360 --> 00:37:27,340 so many things that Darwin taught us in his writings, you know? 
478 00:37:27,340 --> 00:37:31,300 One of his famous things that was at least attributed to him is, you know, ‘Not the 479 00:37:31,300 --> 00:37:34,570 strongest of the species that survives, nor the most intelligent, but it’s the one that’s 480 00:37:34,570 --> 00:37:39,800 most adaptable to change,’ and he hinted at the idea that something was going on through 481 00:37:39,800 --> 00:37:44,500 evolution, that something was being changed to have sort of species survive and adapt 482 00:37:44,500 --> 00:37:49,860 and eventually, you know, sort of deal with sort of the evolutionary progression. Of course 483 00:37:49,860 --> 00:37:54,590 we now know that stuff’s what’s -- is the DNA. That’s where it’s all at and 484 00:37:54,590 --> 00:37:59,540 that’s why a more contemporary scientist, genomicist, wrote, right around the time the 485 00:37:59,540 --> 00:38:03,009 Genome Project ended, that “for the last three and a half billion years, evolution 486 00:38:03,010 --> 00:38:07,920 has been taking notes,” and those notes are in the sequences of the genomes of all 487 00:38:07,920 --> 00:38:12,790 these creatures and so I mention Darwin, I mention that quote, because what happened 488 00:38:12,790 --> 00:38:18,720 when the Genome Project ended was a recognition that we needed to do lots of comparisons of 489 00:38:18,720 --> 00:38:23,319 our genome sequence with other creatures to better understand how our genome sequence 490 00:38:23,320 --> 00:38:29,580 works and we also recognize that we were just -- as a species, just one really, teeny little 491 00:38:29,580 --> 00:38:36,720 twig on a tree of life of great richness and that buried in those notebooks in the DNA 492 00:38:36,720 --> 00:38:40,040 sequence of these other creatures was lots of important information. 
493 00:38:40,040 --> 00:38:42,870 So that’s the reason -- and many of you probably recognize that’s the reason why 494 00:38:42,870 --> 00:38:48,310 we went off and started sequencing lots of critters and their genomes and first we started 495 00:38:48,310 --> 00:38:52,290 with laboratory models, mice and rats and companion animals like dogs and our closest 496 00:38:52,290 --> 00:38:58,960 relatives like chimps, but in fact we needed to use the power of comparative sequence analysis 497 00:38:58,960 --> 00:39:02,640 and sampling more broadly across the tree -- why we started sequencing lots of other 498 00:39:02,640 --> 00:39:07,040 critters, selectively and strategically picked off of different trees. A [unintelligible] 499 00:39:07,040 --> 00:39:10,640 is here, originally it was like 25 species, then it went to 100 species, now we’re -- well 500 00:39:10,640 --> 00:39:16,720 over 200 species have been sequenced and we used all that rich data to basically start 501 00:39:16,720 --> 00:39:23,439 asking questions like what sequences in the human genome are conserved across all mammals 502 00:39:23,440 --> 00:39:26,640 or across all vertebrates or across all primates? 
503 00:39:26,640 --> 00:39:30,379 Because if they’re conserved that heavily, they don’t change over that many years of 504 00:39:30,380 --> 00:39:34,640 evolution, they must be important, because evolution just has a way of going in there 505 00:39:34,640 --> 00:39:39,230 and wanting to sort of change things if it’s not important and so that was the rationale 506 00:39:39,230 --> 00:39:45,290 for moving beyond the human sequence and now sequencing many, many, many species and in 507 00:39:45,290 --> 00:39:50,990 fact what that gave us was remarkable insights about where in the human genome are the most 508 00:39:50,990 --> 00:39:56,720 conserved sequences through evolution, pointing to the sequences most likely to be functionally 509 00:39:56,720 --> 00:40:01,680 important, and in doing that you end up with about five to 10 percent of the human genome 510 00:40:01,680 --> 00:40:05,160 sequence -- five to 10 percent of the three billion letters -- are conserved across almost 511 00:40:05,160 --> 00:40:09,910 all mammals and they’re almost certainly functionally important, but that’s five 512 00:40:09,910 --> 00:40:16,009 to 10 percent of which the genes -- protein-coding genes, one and a half percent, is a minority. 513 00:40:16,010 --> 00:40:19,650 So around five to 10 percent must be functionally important at a minimum and only one and a 514 00:40:19,650 --> 00:40:24,480 half percent of that is protein-coding genes, which means the purple stuff is non-coding, 515 00:40:24,480 --> 00:40:30,130 functional sequences, in many cases conserved as aggressively throughout evolution as have 516 00:40:30,130 --> 00:40:36,350 been our protein-coding genes. Now what are these non-coding functional sequences doing? 517 00:40:36,350 --> 00:40:41,250 Well, they’re doing a lot. 
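[Editor's illustration] The percentages in this passage can be turned into base counts with a short Python sketch. It uses the talk's rounded figures (three billion letters, five to ten percent conserved, one and a half percent protein-coding) and assumes, for simplicity, that all protein-coding sequence falls inside the conserved set. That assumption and the function names are the editor's, not the speaker's.

```python
# Rounded figures quoted in the talk; names are illustrative only.
GENOME_BASES = 3_000_000_000
CODING_PCT = 1.5                     # protein-coding share of the genome
CONSERVED_PCT_RANGE = (5.0, 10.0)    # share conserved across mammals


def noncoding_conserved_bases(conserved_pct: float,
                              coding_pct: float = CODING_PCT,
                              genome_bases: int = GENOME_BASES) -> int:
    """Conserved bases that are not protein-coding.

    Assumes every protein-coding base is also counted as conserved.
    """
    return round(genome_bases * (conserved_pct - coding_pct) / 100)


def coding_share_of_conserved(conserved_pct: float,
                              coding_pct: float = CODING_PCT) -> float:
    """Fraction of the conserved sequence that is protein-coding."""
    return coding_pct / conserved_pct


if __name__ == "__main__":
    for pct in CONSERVED_PCT_RANGE:
        bases = noncoding_conserved_bases(pct)
        share = coding_share_of_conserved(pct)
        print(f"{pct:.0f}% conserved: {bases:,} non-coding conserved bases, "
              f"coding is {share:.0%} of the conserved set")
```

Under either end of the range, protein-coding sequence is a minority of the conserved sequence, which is the point the speaker makes about the non-coding functional portion.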
Probably the thing we know the most about is that they’re 518 00:40:41,250 --> 00:40:46,820 incredibly important in this complex choreography of gene regulation, all these elements, enhancers 519 00:40:46,820 --> 00:40:53,710 and promoters, and silencers, et cetera, et cetera. That’s all these sequence elements 520 00:40:53,710 --> 00:40:57,210 that are controlling these crazy things are going on of where, when, and how much genes 521 00:40:57,210 --> 00:41:01,230 are turned on, and so that’s some of that, but we also know that it’s not just all 522 00:41:01,230 --> 00:41:02,090 gene regulation. 523 00:41:02,090 --> 00:41:06,440 I mean there’s important sequences that help package up chromosomes, important sequences 524 00:41:06,440 --> 00:41:11,710 that help segregate chromosomes, important sequences that help replicate chromosomes 525 00:41:11,710 --> 00:41:18,320 and in fact we know for certain that there’s all sorts of complexity in RNA and we have 526 00:41:18,320 --> 00:41:24,280 really started to reveal a remarkable amount of function associated with non-coding RNA, 527 00:41:24,280 --> 00:41:27,890 something we didn’t even know about when the Genome Project began and now that’s 528 00:41:27,890 --> 00:41:32,890 very, very instrumentally important in many biological functions including gene regulation, 529 00:41:32,890 --> 00:41:37,109 and finally I would contend we should just recognize that there are things we just don’t 530 00:41:37,110 --> 00:41:41,260 know about in non-coding parts of the genome that are certainly functionally important. 531 00:41:41,260 --> 00:41:44,800 We just haven’t found them and figured them out yet and just nobody’s written about 532 00:41:44,800 --> 00:41:49,300 them in textbooks, but they’re coming and we’re going to figure this out and in fact 533 00:41:49,300 --> 00:41:51,420 that’s a very high-priority area. 
534 00:41:51,420 --> 00:41:57,150 Oh, and the other thing that’s transpired in the last 13 years was a greater and greater 535 00:41:57,150 --> 00:42:03,340 and greater appreciation for yet another way that DNA functions, not by directly having 536 00:42:03,340 --> 00:42:10,070 the primary sequence confer function, but by having marks on our DNA put down that influence 537 00:42:10,070 --> 00:42:17,160 how DNA functions. These are called epigenomic marks, alluding to the whole world of epigenomics, 538 00:42:17,160 --> 00:42:22,609 and this involves chromatin and all -- and methylation and various modifications to DNA 539 00:42:22,610 --> 00:42:28,500 and it just turns out that the same methods that you can use for sequencing DNA can be 540 00:42:28,500 --> 00:42:33,600 adapted to read out the epigenomic marks in DNA. So now we have this incredibly strong 541 00:42:33,600 --> 00:42:37,910 ability to read out the second genomic code, if you will, and in fact I’m going to turn 542 00:42:37,910 --> 00:42:42,520 this whole topic over to Laura Elnitski, who’s going to on March 16 come and tell you much 543 00:42:42,520 --> 00:42:47,210 more about epigenomics and also about gene regulation, the topic I was just talking about. 544 00:42:47,210 --> 00:42:52,140 So a lot has happened in this arena but, boy, we also know a lot more needs to happen. 
It 545 00:42:52,140 --> 00:42:56,400 actually keeps getting more complicated because I would say in the last five years we’ve 546 00:42:56,400 --> 00:43:02,610 also realized that DNA is not just some innocent little linear molecule that lays out of the 547 00:43:02,610 --> 00:43:06,590 nucleus, but rather DNA also takes on, in the form of chromosomes, three-dimensional 548 00:43:06,590 --> 00:43:12,140 structures and that these three-dimensional structures also have some functional activities 549 00:43:12,140 --> 00:43:16,819 going on with different domains interacting and so the whole world of genomes in three 550 00:43:16,820 --> 00:43:21,020 dimensions is unfolding, again because of technological advances, that we can figure 551 00:43:21,020 --> 00:43:27,150 out what those interactions are and that’s also a very exciting area of active research. 552 00:43:27,150 --> 00:43:30,540 How do we do this? I mean how do we figure this out? What have we been doing to elucidate 553 00:43:30,540 --> 00:43:35,560 genome function? Well, it’s not simple. It will involve, I will tell you, multiple 554 00:43:35,560 --> 00:43:39,910 generations of scientists to help us fully elucidate the function of the human genome, 555 00:43:39,910 --> 00:43:44,589 and it also involves a number of different elements, if you will. I mean I will tell 556 00:43:44,590 --> 00:43:49,720 you to start off with that we recognize that like other efforts in genomics, Human Genome 557 00:43:49,720 --> 00:43:54,220 Project, 1,000 Genomes and so forth -- this was the -- we needed a team of people working 558 00:43:54,220 --> 00:43:59,189 on this to figure it out. That is why almost immediately after the Genome Project ended, 559 00:43:59,190 --> 00:44:04,610 we launched a program called the ENCODE Project. 
ENCODE stands for Encyclopedia of DNA Elements, 560 00:44:04,610 --> 00:44:10,060 aiming to catalog all of the functional elements in the human genome like those color highlights 561 00:44:10,060 --> 00:44:14,840 I showed earlier. It’s actually -- and it’s been going quite effectively; we’re about 562 00:44:14,840 --> 00:44:19,030 to start the next phase of ENCODE over the next year or so. It also had a sibling for 563 00:44:19,030 --> 00:44:25,310 a while called modENCODE for model organism ENCODE and what these efforts -- ENCODE, modENCODE 564 00:44:25,310 --> 00:44:28,820 -- basically do is create GPS-like views of the genome. 565 00:44:28,820 --> 00:44:33,640 So here you’re looking at one such view; this is just a view of the human genome that 566 00:44:33,640 --> 00:44:37,819 you can get to public -- on a publicly accessible website, where you zoom in and out and look 567 00:44:37,820 --> 00:44:42,610 at it and this is just an overwhelming amount of data generated by ENCODE. Some of it’s 568 00:44:42,610 --> 00:44:46,170 laboratory-based data, some of it’s experimental data, but -- I’m not going to go through 569 00:44:46,170 --> 00:44:50,150 it, but needless to say it has everything we could possibly imagine at the moment having 570 00:44:50,150 --> 00:44:54,380 information about -- that region of the genome, with respect to where are the genes, where 571 00:44:54,380 --> 00:44:59,840 are the conserved sequences, where’s transcription factors binding, what parts of DNA gets made 572 00:44:59,840 --> 00:45:05,740 into RNA, where’s the chromatin opening up and so forth, and that data really provides 573 00:45:05,740 --> 00:45:10,370 some insights about where’s the functional stuff and the challenge of course is synthesizing 574 00:45:10,370 --> 00:45:14,220 all this and really getting a very strict interpretation for every nucleotide, what 575 00:45:14,220 --> 00:45:18,589 the function is, but this is the kind of thing ENCODE has been doing and other efforts.
There’s 576 00:45:18,590 --> 00:45:22,640 a big epigenomics effort that went on similarly, has contributed to this as well and will continue 577 00:45:22,640 --> 00:45:24,310 to do so. 578 00:45:24,310 --> 00:45:29,020 This has involved not just looking at human DNA, but recognizing that model organisms 579 00:45:29,020 --> 00:45:33,530 play a very important role in this; a very basic science effort to understand genome 580 00:45:33,530 --> 00:45:39,360 function that sort of traverses everything from yeast all the way to humans. I’ll tell 581 00:45:39,360 --> 00:45:44,070 you that increasingly computational modeling comes in; this is not just about doing experiments 582 00:45:44,070 --> 00:45:49,440 and more and more sophisticated computational modeling methods are needed to fully elucidate 583 00:45:49,440 --> 00:45:53,620 how the human genome works and as I sort of alluded to earlier, there have been major 584 00:45:53,620 --> 00:45:57,720 and will continue to need to be major advances in technology development to really figure 585 00:45:57,720 --> 00:46:03,080 out all the nuances of the human genome and I will tell you that I -- John Quackenbush 586 00:46:03,080 --> 00:46:09,000 will be here April 27 talking about many aspects of gene expression and systems biology and 587 00:46:09,000 --> 00:46:12,770 I’m sure he will touch some of the things that I’m representing in this part, in particular 588 00:46:12,770 --> 00:46:14,670 in this slide, and so I thought I would point that out. 589 00:46:14,670 --> 00:46:18,420 So where are we in interpreting the human genome? Well, I guess I could say a couple 590 00:46:18,420 --> 00:46:23,360 things: I sometimes say that it’s multi-generational; we’re just in the first generation. My grandchildren 591 00:46:23,360 --> 00:46:27,170 I would imagine will be interpreting -- helping to interpret the human genome sequence.
You 592 00:46:27,170 --> 00:46:31,100 know, I sometimes say that, you know, when the Genome Project ended, we knew about this 593 00:46:31,100 --> 00:46:35,290 much about how the human genome works; now we know about this much and eventually we 594 00:46:35,290 --> 00:46:37,779 need to know about this much, so we still have a long way to go. We’ve made a lot 595 00:46:37,780 --> 00:46:41,670 of progress. Sometimes I just sort of joke and say we’re sort of at a SparkNotes view 596 00:46:41,670 --> 00:46:44,850 of the human genome, for those of you who know what SparkNotes is. You know, maybe it’ll 597 00:46:44,850 --> 00:46:48,040 get you ready for the test tomorrow but, you know, if it’s really a hard problem, we’re 598 00:46:48,040 --> 00:46:51,680 going to need more than just SparkNotes and -- but that’s why we will need to keep hammering 599 00:46:51,680 --> 00:46:55,649 at this. This is -- this is going to be a long haul part of the marathon, figuring out 600 00:46:55,650 --> 00:46:57,610 how the human genome works. 601 00:46:57,610 --> 00:47:02,700 Well, here we are. We have the sequence, we have our blueprint, we can easily and cheaply 602 00:47:02,700 --> 00:47:07,620 get at other sequences, we’re beginning to understand how our sequences differ among 603 00:47:07,620 --> 00:47:12,410 people and we are starting to get better and better insights about the function of the 604 00:47:12,410 --> 00:47:16,629 genome. It is time to start thinking about what we originally thought was going to be 605 00:47:16,630 --> 00:47:20,700 really valuable for genomics, and that is to be able to start applying our efforts to 606 00:47:20,700 --> 00:47:25,220 understand human disease and I would contend there have been significant advances, especially 607 00:47:25,220 --> 00:47:30,200 in the last 13 years, in unraveling the genomic basis of human disease and this certainly 608 00:47:30,200 --> 00:47:32,790 deserves the fifth part of the highlight. 
609 00:47:32,790 --> 00:47:37,290 Now I will also point out and I’m going to programmatically just sort of mention that 610 00:47:37,290 --> 00:47:42,850 I think our institute has been very helpful at figuring out how to use genome sequencing 611 00:47:42,850 --> 00:47:49,650 to study human disease and that really grew out of our commitment to having our -- especially 612 00:47:49,650 --> 00:47:54,510 our extramural research efforts be sustaining us, keeping us at the cutting edge of genome 613 00:47:54,510 --> 00:48:00,460 analysis. So our largest part of our institute’s extramural portfolio is our -- called our 614 00:48:00,460 --> 00:48:04,890 Genome Sequencing Program, which has had a progression starting with the Genome Project. 615 00:48:04,890 --> 00:48:09,609 They were the groups that in the United States, in particular the NIH-funded parts -- that 616 00:48:09,610 --> 00:48:12,750 were heavily involved in generating the first sequence of the human genome and then moving 617 00:48:12,750 --> 00:48:17,060 on to help us figure out the sequence of these other genomes I alluded to and then working 618 00:48:17,060 --> 00:48:21,520 on things like the HapMap Project and the 1,000 Genomes that I’ve mentioned and then 619 00:48:21,520 --> 00:48:26,620 starting to focus on disease and initially working on cancer, which I’m going to have 620 00:48:26,620 --> 00:48:30,299 more to say about in a bit, but the -- a very well-known project called the Cancer Genome 621 00:48:30,300 --> 00:48:35,070 Atlas, which we did in partnership with the Cancer Institute, which is now in its last 622 00:48:35,070 --> 00:48:41,180 phase, but then in particular of late and now moving forward, focusing on rare and common 623 00:48:41,180 --> 00:48:46,819 diseases, simply asking the question: how can you scale up the use of genome sequencing 624 00:48:46,820 --> 00:48:51,320 to be able to figure out the genomic basis of rare and common diseases, and so let me 625 00:48:51,320 --> 
00:48:56,500 just remind you, because this is really important to understand, the differences between rare 626 00:48:56,500 --> 00:49:00,390 and common diseases when it comes to the underlying genomic architecture. 627 00:49:00,390 --> 00:49:06,500 Well, so on the one hand we have rare diseases; these are diseases like sickle-cell and cystic 628 00:49:06,500 --> 00:49:11,190 fibrosis, Huntington’s disease; rare in the population but it turns out they’re 629 00:49:11,190 --> 00:49:17,770 quite simple, quite -- it’s an understatement, but in terms of simply involving mutations 630 00:49:17,770 --> 00:49:22,330 affecting a single gene. These are monogenic disorders, also referred to as Mendelian disorders 631 00:49:22,330 --> 00:49:29,370 -- after the famous geneticist Mendel, and here it is very clear there’s great potential. 632 00:49:29,370 --> 00:49:37,609 There’s over 7,000 known rare diseases and remarkably, we recognize that with the new 633 00:49:37,610 --> 00:49:42,570 sequencing methods, we can really accelerate the pace at which we can figure out the genomic 634 00:49:42,570 --> 00:49:44,050 basis of rare diseases. 635 00:49:44,050 --> 00:49:48,320 We’ve had a -- for several years and we just renewed the program -- a program called 636 00:49:48,320 --> 00:49:52,260 the Centers for Mendelian Genomics, which is a series of centers that truly is doing 637 00:49:52,260 --> 00:49:56,980 that, tackling these rare disorders in an industrialized fashion to try to figure out 638 00:49:56,980 --> 00:50:02,070 the underlying mutations in these genes and they’ve made great progress along with the 639 00:50:02,070 --> 00:50:05,590 other worldwide efforts, but you have to recognize there’s more progress to be made, which 640 00:50:05,590 --> 00:50:11,290 is why we have these centers working so hard. 
There are about 7,500 rare genetic diseases 641 00:50:11,290 --> 00:50:17,300 but we have found the gene, or the defective -- the mutated -- mutations underlying those 642 00:50:17,300 --> 00:50:24,000 diseases for about 4,300 of those. Forty-three hundred is a remarkable number considering 643 00:50:24,000 --> 00:50:29,530 that the day the Genome Project started, we only knew the genomic basis for 61 of those 644 00:50:29,530 --> 00:50:34,680 diseases. So in a quarter century we’ve gone from knowledge about 61 diseases, to 645 00:50:34,680 --> 00:50:39,029 knowledge -- molecular knowledge of about 4,300. That’s the good news. The challenging 646 00:50:39,030 --> 00:50:42,350 news is we want to finish this up; we want to get the next few thousand and that’s 647 00:50:42,350 --> 00:50:46,890 what this project is doing and so that’s been remarkable advance I think in the last 648 00:50:46,890 --> 00:50:49,020 quarter century with respect to rare diseases. 649 00:50:49,020 --> 00:50:53,380 Now what about the other class of diseases, common diseases, because common diseases are 650 00:50:53,380 --> 00:50:57,730 what you’re much more familiar with? You know, rare diseases are rare in the population; 651 00:50:57,730 --> 00:51:04,290 they’re certainly quite burdensome to families and patients, but in aggregate that’s not 652 00:51:04,290 --> 00:51:07,820 what accounts for most hospital visits, clinic visits, that’s not what fill hospitals and 653 00:51:07,820 --> 00:51:12,680 clinics. It’s common diseases that fill hospitals and clinics; this is hypertension, 654 00:51:12,680 --> 00:51:16,810 diabetes, autism, Alzheimer’s, cardiovascular disease and so forth. When they’re common 655 00:51:16,810 --> 00:51:20,950 in the population, the hard part is that they’re complicated. 
They’re complicated because 656 00:51:20,950 --> 00:51:24,049 it’s not just a single gene, in fact it’s not even necessarily genes, it could even 657 00:51:24,050 --> 00:51:28,740 -- because we actually believe a lot of it is non-coding functional sequences, it’s 658 00:51:28,740 --> 00:51:34,520 multiple, multiple variants we believe scattered throughout the genome, with what is typically 659 00:51:34,520 --> 00:51:38,770 a greater contribution of the environment, especially compared to rare diseases, and 660 00:51:38,770 --> 00:51:44,050 so we knew this was going to be really complicated and indeed, it certainly has proven to be 661 00:51:44,050 --> 00:51:46,370 pretty complicated. 662 00:51:46,370 --> 00:51:51,420 This of course was why there was a big push once we -- that’s the reason why we wanted 663 00:51:51,420 --> 00:51:57,210 to get all these variants cataloged, so that we could do studies to analyze these variants 664 00:51:57,210 --> 00:52:02,100 in thousands of people and we could do this across the whole genome, genome-wide, and 665 00:52:02,100 --> 00:52:07,180 we could try to associate known common variants with diseases like hypertension, diabetes, 666 00:52:07,180 --> 00:52:13,520 and so forth. That’s what gave rise to the genome-wide association study idea, GWAS, 667 00:52:13,520 --> 00:52:19,990 and GWAS was an idea that on paper everybody hoped would work and it did and it started 668 00:52:19,990 --> 00:52:25,620 in 2008, was the first published genome-wide association study, and since 2008 there’s 669 00:52:25,620 --> 00:52:31,569 been over 2,000 such studies that have come about and in fact there are now about 4,000 670 00:52:31,570 --> 00:52:36,700 places in the genome that we believe have a variant in them that are conferring the 671 00:52:36,700 --> 00:52:38,430 risk for getting the disease. 
672 00:52:38,430 --> 00:52:43,080 The problem is it’s just a risk, it’s not an absolute -- usually these are absolutes 673 00:52:43,080 --> 00:52:48,830 -- these are not and it’s not telling you which variant it is; these GWAS studies only 674 00:52:48,830 --> 00:52:52,700 tell you what neighborhood they live in, they only tell you a region of a chromosome. There’s 675 00:52:52,700 --> 00:52:56,689 still lots and lots of variants there and so the good news is that we -- and again, 676 00:52:56,690 --> 00:53:01,230 we’ve gone from sort of not knowing how to decipher this complexity, to having some 677 00:53:01,230 --> 00:53:05,540 really good ideas of where we really need to hunt in greater detail, but there’s been 678 00:53:05,540 --> 00:53:09,910 only a very few examples where we’ve actually gotten it down to a very specific variant 679 00:53:09,910 --> 00:53:13,190 and known what that variant is doing and we need to do that thousands and thousands of 680 00:53:13,190 --> 00:53:16,300 times for all these very important disorders. 681 00:53:16,300 --> 00:53:19,490 So we need to -- actually what we’ve learned in the last five years in particular is we 682 00:53:19,490 --> 00:53:23,359 need bigger and bigger efforts. 
That’s why we actually have now turned the attention 683 00:53:23,360 --> 00:53:27,520 of our largest centers, in a new program which actually we just formally announced a few 684 00:53:27,520 --> 00:53:32,030 weeks ago called the Centers for Common Disease Genomics and these are centers that are going 685 00:53:32,030 --> 00:53:36,560 to teach us how to do this, we hope, along with other efforts in the world, where there 686 00:53:36,560 --> 00:53:40,820 are -- we now know that you don’t just -- we need to completely sequence the genomes of 687 00:53:40,820 --> 00:53:46,070 lots of people with a disease like hypertension or cardiovascular disease or autism and Alzheimer’s 688 00:53:46,070 --> 00:53:50,620 disease, diabetes, and then lots of controls and you need to probably do tens of thousands, 689 00:53:50,620 --> 00:53:56,700 we now are recognizing, and have very large datasets to analyze and then use very fancy 690 00:53:56,700 --> 00:54:00,529 statistical methods to tease out which of the variants that are actually the ones conferring 691 00:54:00,530 --> 00:54:05,510 risk. And so while we don’t have as much to report yet with common diseases like we 692 00:54:05,510 --> 00:54:09,240 do with rare diseases, I would just say we are on a trajectory I think that’s going 693 00:54:09,240 --> 00:54:12,430 to give us a lot of insights about -- strategically how we’re going to do this and hopefully 694 00:54:12,430 --> 00:54:16,940 we will get a lot of new insights over the next five and ten years. 
695 00:54:16,940 --> 00:54:22,160 But I will tell you that all of these efforts, especially what’s going on now in efforts 696 00:54:22,160 --> 00:54:25,920 like this where literally tens of thousands of individuals’ genomes are going to be 697 00:54:25,920 --> 00:54:30,390 sequenced and analyzed, are going -- is just an immense amount of data and that’s on 698 00:54:30,390 --> 00:54:34,549 top of all the other data, of all the other projects I’ve been telling you about and 699 00:54:34,550 --> 00:54:40,590 -- oh, and by the way, I forgot to mention, before I go on: Dave -- with everything else, 700 00:54:40,590 --> 00:54:45,360 a lot more will be said about rare diseases when Dave Valle is here on April 13 and a 701 00:54:45,360 --> 00:54:49,340 lot more will be said about common diseases, in a much more sophisticated way than I did, 702 00:54:49,340 --> 00:54:54,110 by Karen Mohlke when she’s here on April 6, but what all of them will tell you is that 703 00:54:54,110 --> 00:54:58,710 -- and in fact all of you probably realize, is that the world we now live in, as these 704 00:54:58,710 --> 00:55:05,200 wonderful technologies get us more and more data, is that we have the circumstance where 705 00:55:05,200 --> 00:55:10,049 we are overwhelmed by the amount of data coming out of these sequencing instruments and -- in 706 00:55:10,050 --> 00:55:16,190 particular and other new technologies and this is sort of putting us, in fact, into 707 00:55:16,190 --> 00:55:19,920 a new circumstance where the bottleneck is not generating the information, the bottleneck 708 00:55:19,920 --> 00:55:27,470 is analyzing it, and so this bottleneck really is one where these fancy new methods for sequencing 709 00:55:27,470 --> 00:55:33,660 DNA are giving us a circumstance where reading out a genome sequence is not the hard part. 
710 00:55:33,660 --> 00:55:39,290 You know, the hard part actually is sort of progressing on and then figuring out what 711 00:55:39,290 --> 00:55:43,590 to do with the information about the variants in our genomes. As you’re going to hear 712 00:55:43,590 --> 00:55:49,150 from other speakers, you know, there’s issues just around hardware and having enough capacity 713 00:55:49,150 --> 00:55:52,770 to store all this data. There’s issues of varying -- we need increasingly better and 714 00:55:52,770 --> 00:55:57,140 better software tools and of course we need a workforce that’s able to do all of this 715 00:55:57,140 --> 00:56:02,350 which is why some of you are here, is to help become that workforce, and that also has to 716 00:56:02,350 --> 00:56:06,360 include taking it to the final stage of knowing how to take information, write individual 717 00:56:06,360 --> 00:56:10,710 variants and knowing how relevant that is for individual patients. So as -- this is 718 00:56:10,710 --> 00:56:14,470 why Tyra and Andy in particular dedicate three lectures -- talk 719 00:56:14,470 --> 00:56:18,160 about the data analysis, the data science, the bioinformatics, whatever phrase you want 720 00:56:18,160 --> 00:56:23,450 to use of everything I’m describing, because data analysis is actually sort of a big part 721 00:56:23,450 --> 00:56:27,450 of all of this and in many ways is sort of the grand challenge that all of us are working 722 00:56:27,450 --> 00:56:29,770 on.
723 00:56:29,770 --> 00:56:35,450 So I’ve gone through five highlights of the past quarter century and then of course 724 00:56:35,450 --> 00:56:39,730 as a recognition I’ve said nothing about medical practice, as if maybe there wasn’t 725 00:56:39,730 --> 00:56:43,910 any highlights and probably a few years ago I would have been very limited in what I might 726 00:56:43,910 --> 00:56:48,700 have been able to say, but I actually think that this is worth putting as a sixth highlight 727 00:56:48,700 --> 00:56:52,960 of the first quarter century of genomics, because I really do believe there are vivid 728 00:56:52,960 --> 00:56:57,920 examples of genomic medicine that are just starting to emerge. It is the tip of the iceberg 729 00:56:57,920 --> 00:57:01,650 and there will be a lot more, but I think it is worth highlighting what some of these 730 00:57:01,650 --> 00:57:05,890 are, because I really believe that we are seeing genomic medicine come into focus in 731 00:57:05,890 --> 00:57:09,299 a fashion that actually -- even more exciting than when I spoke in this series a couple 732 00:57:09,300 --> 00:57:13,950 of years ago and so I thought I would just quickly go through what these highlights are 733 00:57:13,950 --> 00:57:20,910 because Bruce Korf will talk about it in much greater detail when he is here talking holistically 734 00:57:20,910 --> 00:57:25,500 about genomic medicine, but I thought in particular you may want to hear especially from my perspective 735 00:57:25,500 --> 00:57:30,100 -- as the director of the Genome Institute, what do I think are sort of the hot areas 736 00:57:30,100 --> 00:57:33,910 in genomics and what are some of the programs we’re doing to facilitate advances in those 737 00:57:33,910 --> 00:57:37,799 hot areas. And so I’m going to just sort of, again, go through highlights. I’m going 738 00:57:37,800 --> 00:57:41,960 to highlight five areas that I think are the hottest ones in genomic medicine. 
739 00:57:41,960 --> 00:57:46,590 I’ll start with cancer because cancer is the hottest area in genomic medicine implementation. 740 00:57:46,590 --> 00:57:52,450 I don’t need to probably tell a sophisticated audience like this that cancer’s a disease 741 00:57:52,450 --> 00:57:56,980 of the genome and what happens in cancer is that mutations get picked up by normal cells 742 00:57:56,980 --> 00:58:02,560 and eventually make those cells grow out of control to become tumors. But those mutations 743 00:58:02,560 --> 00:58:07,250 are sitting in the genomes of these tumors and those genomes can be sequenced just like 744 00:58:07,250 --> 00:58:11,040 a normal cell’s genome can be sequenced. And with better and better methods for sequencing 745 00:58:11,040 --> 00:58:17,210 DNA we can open up these tumors’ blueprint, that’s genome, and read it out and begin 746 00:58:17,210 --> 00:58:21,300 to catalog all the sequence changes. And that’s why efforts like I mentioned, the Cancer Genome 747 00:58:21,300 --> 00:58:22,930 Atlas did exactly that. 748 00:58:22,930 --> 00:58:27,529 What that also can do then is start to give better and better information to diagnose 749 00:58:27,530 --> 00:58:33,340 cancer and perhaps to think about better ways to treat cancer. And there is a whole -- incredible 750 00:58:33,340 --> 00:58:38,420 areas that I couldn’t possibly represent. I will say that in terms of actually changing 751 00:58:38,420 --> 00:58:41,880 the practice of medicine -- since I’m trained as a pathologist actually, I do recognize 752 00:58:41,880 --> 00:58:46,990 the diagnostic potential is here, you know, as one example of many, you know, for many, 753 00:58:46,990 --> 00:58:54,770 many decades most cancer diagnostics involved histopathology as the major tool.
And so that 754 00:58:54,770 --> 00:59:00,190 will continue, but I already have seen and for some kinds of cancer that that histopathology 755 00:59:00,190 --> 00:59:06,220 is augmented by genomic signatures that in genomic profiles, if you will, of tumors that 756 00:59:06,220 --> 00:59:11,750 come out of machines like this and other machines. And it’s here and now. This is not hypothetical. 757 00:59:11,750 --> 00:59:15,540 It is absolutely here and now for some types of cancer. And if you don’t believe me, 758 00:59:15,540 --> 00:59:20,130 just watch television, go to websites, and so forth. You will find a website such as 759 00:59:20,130 --> 00:59:23,580 that shown on the top. You will find, and may have seen advertisements -- and I keep 760 00:59:23,580 --> 00:59:29,490 seeing them all the time, they’re increasing -- on television, whereby -- from prominent 761 00:59:29,490 --> 00:59:33,729 cancer treatment facilities talking about genomic this, genomic that, talking about 762 00:59:33,730 --> 00:59:38,050 the DNA of cancer care and so forth. This is mainstream. It’s used for marketing because 763 00:59:38,050 --> 00:59:43,150 genomics is absolutely here to stay with respect to cancer diagnostics and cancer treatment. 764 00:59:43,150 --> 00:59:45,490 It is the lowest hanging fruit. 765 00:59:45,490 --> 00:59:50,569 I think another low-hanging fruit is the world of pharmacogenomics -- pharmacology meeting 766 00:59:50,570 --> 00:59:57,940 genomics -- recognizing that there is a reason why all of us respond differently. We respond 767 00:59:57,940 --> 01:00:03,160 differently to everything. That’s -- I’m the guy on the left, okay? That’s me. My 768 01:00:03,160 --> 01:00:07,339 children are like this, right? Actually maybe it was a bad example because that might imply 769 01:00:07,340 --> 01:00:11,210 genetics that’s not in play here, so maybe that’s not a good example. But we all respond 770 01:00:11,210 --> 01:00:14,050 differently to everything. 
What I really am getting at is not rollercoasters. What I’m 771 01:00:14,050 --> 01:00:20,110 really getting at is medications. Every medication in this pharmacy, in this hospital, in CVS, 772 01:00:20,110 --> 01:00:23,620 Walgreens, every one of those medications works, they just don’t work in everyone. 773 01:00:23,620 --> 01:00:28,900 But, and in fact, they are really -- often, often don’t work. In fact, “Nature” 774 01:00:28,900 --> 01:00:32,820 had an article about this which I thought was very interesting just recently talking about 775 01:00:32,820 --> 01:00:37,990 the imprecise medicine. And here are sort of 10 very commonly prescribed medicines. 776 01:00:37,990 --> 01:00:42,479 And the person in blue in each case is the person where the medicine works. And the people 777 01:00:42,480 --> 01:00:46,630 in red are the number of people proportionally where the medicine doesn’t work. 778 01:00:46,630 --> 01:00:51,070 Well, there’s a lot of reasons why the medicines don’t work, but a good part of that is different 779 01:00:51,070 --> 01:00:57,730 ways that we metabolize drugs or how it affects us physiologically, much of which is due to 780 01:00:57,730 --> 01:01:02,670 variations in our genomes. And so the idea underlying all of them -- and we are learning 781 01:01:02,670 --> 01:01:08,140 more and more about this -- is that we can take individuals with the same diagnosis, but 782 01:01:08,140 --> 01:01:12,640 do genomic profiling of them, get genomic information on them, and figure out who has 783 01:01:12,640 --> 01:01:16,730 variants that are going to make you a good responder or a not-so-good responder, or even 784 01:01:16,730 --> 01:01:21,820 be a bad responder. And do that before you decide on what medication to give a person 785 01:01:21,820 --> 01:01:26,990 or what the dosage might be, and so forth.
So pharmacogenomics is here and now, recognized 786 01:01:26,990 --> 01:01:31,810 widely as something that will increase substantially over the next decade, which is why Andy and 787 01:01:31,810 --> 01:01:37,509 Tyra have Howard McLeod coming here, a regular in this series, to talk exclusively about 788 01:01:37,510 --> 01:01:40,750 pharmacogenomics on May 7. 789 01:01:40,750 --> 01:01:45,280 Third highlight here and now, actually this building’s a great place to talk about it 790 01:01:45,280 --> 01:01:52,690 in, is the use of genomics to do rare genetic disease diagnostics. The notion of having 791 01:01:52,690 --> 01:01:57,040 disease strike from nowhere, individuals with conditions that nobody seems to figure out 792 01:01:57,040 --> 01:02:02,980 what’s wrong with them, but for a -- in many cases these individuals have major amounts 793 01:02:02,980 --> 01:02:07,130 of resources spent trying to diagnose them, that the idea of just sequencing the genome 794 01:02:07,130 --> 01:02:11,460 as perhaps giving a clue, just makes a whole lot of sense. So as “Nature” pointed out 795 01:02:11,460 --> 01:02:15,270 in this article, disorders not readily explained by standard tests can sometimes be diagnosed 796 01:02:15,270 --> 01:02:19,500 through genome sequencing analysis. And sometimes -- it’s about a quarter to a third of the 797 01:02:19,500 --> 01:02:24,290 time by today’s methods -- we’ll find out what the diagnosis is by sequencing a 798 01:02:24,290 --> 01:02:25,140 genome. 799 01:02:25,140 --> 01:02:30,400 And the notion of undiagnosed diseases, undiagnosed conditions really, has come to the fore -- actually, 800 01:02:30,400 --> 01:02:34,740 deserving a lot of credit, activities taking place right here in this building. You know, 801 01:02:34,740 --> 01:02:39,529 patients on these long diagnostic odysseys, going from doctor to doctor, medical center 802 01:02:39,530 --> 01:02:43,270 to medical center, nobody can figure out what’s wrong with them.
Shown here is Bill Gahl, 803 01:02:43,270 --> 01:02:46,090 who’s our clinical director at our institute, but also is the leader of 804 01:02:46,090 --> 01:02:50,430 something called the Undiagnosed Diseases Program, which took place right here, started 805 01:02:50,430 --> 01:02:55,140 here in this -- the NIH Clinical Center and really reduced to practice the idea of bringing 806 01:02:55,140 --> 01:03:00,940 these patients in and having a rigorous clinical evaluation in addition to a genomic analysis 807 01:03:00,940 --> 01:03:05,220 to try to see if you can figure out -- get a diagnosis, not that that yields a diagnosis 808 01:03:05,220 --> 01:03:10,830 every time, but it does quite frequently and it has been a remarkably successful program, 809 01:03:10,830 --> 01:03:15,830 and is here and now and is mainstream and, in fact, has just been expanded. Recognizing 810 01:03:15,830 --> 01:03:21,560 its success, it is now a nationalized program. NIH has its pivotal role right here, but we 811 01:03:21,560 --> 01:03:26,230 now have established through a Common Fund program called the Undiagnosed Diseases Network, 812 01:03:26,230 --> 01:03:30,780 a series of other sites in the country who are doing similar work, as well. Here and 813 01:03:30,780 --> 01:03:38,220 now, genome analysis as part of diagnostics for rare, often undiagnosed conditions. 814 01:03:38,220 --> 01:03:43,609 Also here and now, in one case and in another case being contemplated certainly, is the 815 01:03:43,610 --> 01:03:47,120 genomics of -- the package as I understand is called Genomics of Pregnancy. It’s actually 816 01:03:47,120 --> 01:03:52,980 two stories in one. But genomics plays a big role now in pregnancy. There’s a sign -- it’s 817 01:03:52,980 --> 01:03:56,340 a sign that I think is important to think about. Let me tell you each of these.
The 818 01:03:56,340 --> 01:04:03,300 here and now is this: you know, we’ve been doing prenatal testing for many decades actually, 819 01:04:03,300 --> 01:04:07,930 but we had to access fetal DNA. The way you access fetal DNA traditionally is through 820 01:04:07,930 --> 01:04:13,850 invasive procedures like amniocentesis or chorionic villus sampling. But the new methods 821 01:04:13,850 --> 01:04:19,799 for sequencing DNA are so exquisitely sensitive that now we have -- the community has figured 822 01:04:19,800 --> 01:04:26,640 out how to basically access that DNA that gets shed into the maternal blood stream from 823 01:04:26,640 --> 01:04:32,670 the fetus. And that cell-free DNA can be accessed by a simple blood draw, a relatively non-invasive 824 01:04:32,670 --> 01:04:38,540 procedure. And so, the idea of doing non-invasive, prenatal genome sequencing and genome analysis 825 01:04:38,540 --> 01:04:42,070 is here and now. I mean, there’s lots of literature about this. In fact, it’s winning 826 01:04:42,070 --> 01:04:47,010 lots of prizes over the past few years because it is changing the face of prenatal diagnostics 827 01:04:47,010 --> 01:04:50,210 and written about in the popular press, as well. 828 01:04:50,210 --> 01:04:53,890 Recent data came out that’s just, I think, breathtaking. You think, “Oh, okay, there’s 829 01:04:53,890 --> 01:04:57,200 a few people doing this. They get this blood drawn, they get this --” no, no, not a few 830 01:04:57,200 --> 01:05:01,899 people worldwide, millions of people in the world. In an article last year, you can see 831 01:05:01,900 --> 01:05:07,910 the rise in the use of non-invasive methods involving genome analysis by simple blood 832 01:05:07,910 --> 01:05:12,089 draws using cell-free DNA of the fetus. You can see in aggregate -- it’s well over 833 01:05:12,090 --> 01:05:15,840 a million and well -- and [unintelligible] suspected to be well over a million in 2015.
834 01:05:15,840 --> 01:05:18,850 I haven’t seen the numbers yet. Remarkable. 835 01:05:18,850 --> 01:05:23,339 Here and now is actually reducing amniocentesis and chorionic villus sampling substantially as 836 01:05:23,340 --> 01:05:31,100 a result. Here and now: prenatal testing using genome sequencing. Not so here and now is 837 01:05:31,100 --> 01:05:36,819 this notion of the other end of pregnancy: you get a baby. You have a newborn. And “Time” 838 01:05:36,820 --> 01:05:40,770 magazine thinks that by 2025, everyone’s going to get their DNA mapped. I think they 839 01:05:40,770 --> 01:05:45,180 meant DNA sequenced, but I’m not so sure. And I think we want to think about that. And 840 01:05:45,180 --> 01:05:52,000 it’s an interesting notion but it brings in a lot of logistical concepts and challenges 841 01:05:52,000 --> 01:05:55,380 and certainly a lot of ethical ones, as well. Do we want every child sequenced at birth, 842 01:05:55,380 --> 01:05:59,410 have their genomic information carried with them maybe in their electronic medical record 843 01:05:59,410 --> 01:06:01,690 for life? Not clear. 844 01:06:01,690 --> 01:06:06,070 So we actually are studying this. We actually set up a program with the Child Health Institute 845 01:06:06,070 --> 01:06:09,970 to study this, to get a research foundation to think about these things. We have a series 846 01:06:09,970 --> 01:06:14,129 of sites that are now sequencing the genomes of newborns and asking how does it change 847 01:06:14,130 --> 01:06:17,570 their care. And we’ll learn a lot over the next five years. 848 01:06:17,570 --> 01:06:22,450 I will highlight one of these studies, in particular. One of these sites is dealing 849 01:06:22,450 --> 01:06:28,740 with not healthy newborns.
And here “Nature” talked about an article that an investigator 850 01:06:28,740 --> 01:06:33,450 is doing, which I think is remarkable, Stephen Kingsmore, where he has basically taken acutely 851 01:06:33,450 --> 01:06:38,850 ill children in the NICU -- newborns in the NICU -- where the doctors have basically -- have 852 01:06:38,850 --> 01:06:42,270 no idea what’s wrong with the child, know the child will die within a matter of days, 853 01:06:42,270 --> 01:06:45,810 and simply don’t know what’s wrong with them. And he’s reduced to practice the idea 854 01:06:45,810 --> 01:06:50,029 of getting a small amount of blood and sequencing their genomes in less than a day and getting 855 01:06:50,030 --> 01:06:53,770 information that in some cases, not always -- but in some cases, gives insights about 856 01:06:53,770 --> 01:06:57,840 what’s wrong, in some cases saving the children, saving those newborns. And in fact, I think 857 01:06:57,840 --> 01:07:02,950 this is also going to become more commonplace for acutely ill children in the NICU where 858 01:07:02,950 --> 01:07:05,870 they simply don’t know what’s wrong with that child to quickly try to get a genome 859 01:07:05,870 --> 01:07:08,770 sequence. So another here and now. 860 01:07:08,770 --> 01:07:15,620 The last hot area is hot not because it’s solved, but because it’s really important. 861 01:07:15,620 --> 01:07:20,690 And it relates to the development of information systems that are connecting our knowledge 862 01:07:20,690 --> 01:07:25,680 of variants to their clinical relevance. And I really want to emphasize that generating 863 01:07:25,680 --> 01:07:30,490 a human genome sequence today is almost trivial. That is just not -- it’s not hard to read 864 01:07:30,490 --> 01:07:36,220 out the six billion Gs, As, Ts, and Cs of a given patient.
What is really hard is then 865 01:07:36,220 --> 01:07:41,819 taking those six billion letters and rounding on the patient the next morning and having 866 01:07:41,820 --> 01:07:47,030 any clue what those variants mean. I mean, we just don’t know -- the vast, vast, vast 867 01:07:47,030 --> 01:07:50,920 majority of those variants we have no idea if they’re clinically relevant or not. 868 01:07:50,920 --> 01:07:54,730 We need to fix that. There is a disconnect even between what we know in the scientific 869 01:07:54,730 --> 01:07:59,380 literature and what a busy health care professional is actually doing in terms of trying to manage 870 01:07:59,380 --> 01:08:04,300 the care of patients. So we believe that we need to create much more robust -- that’s 871 01:08:04,300 --> 01:08:09,130 why it’s hot -- we need to create these clinical genomics information systems, probably 872 01:08:09,130 --> 01:08:13,390 ones that integrate nicely with electronic health records and also ones that deliver 873 01:08:13,390 --> 01:08:19,778 very clear guidelines to health care professionals in a very simple way because they’re going 874 01:08:19,779 --> 01:08:24,290 to be looking this up on these kinds of devices in the busy workflow of a nurse, of a pharmacist, 875 01:08:24,290 --> 01:08:28,859 of a physician’s assistant, of a physician. And that needs to be done in a very robust 876 01:08:28,859 --> 01:08:32,620 fashion. The truth, though, is it doesn’t exist. So we’ve put together a research 877 01:08:32,620 --> 01:08:37,000 network called the Clinical Genome Resource or ClinGen, which you can read about in this 878 01:08:37,000 --> 01:08:42,500 paper or look at this website, which is basically trying to scientifically figure out how do 879 01:08:42,500 --> 01:08:46,839 you build the knowledge base that could then be used by busy health care professionals? 880 01:08:46,839 --> 01:08:49,080 And so we’re at early stages.
We’re not even building it yet, we’re just trying 881 01:08:49,080 --> 01:08:54,568 to figure out how do you take this explosion of literature about variants and which ones 882 01:08:54,569 --> 01:08:58,310 are medically important and which ones do you act on, which ones do you not, and reduce 883 01:08:58,310 --> 01:09:02,500 it to something that can be looked up quickly so that you can have the patient management 884 01:09:02,500 --> 01:09:07,939 be done efficiently with knowledge of that genomic information. So that’s something 885 01:09:07,939 --> 01:09:12,009 to look for, but it’s -- we’ve got a long way to go but we’re trying to facilitate 886 01:09:12,010 --> 01:09:12,480 it. 887 01:09:12,479 --> 01:09:15,450 So I want to transition before I spend the last 15 minutes talking about something else 888 01:09:15,450 --> 01:09:21,309 and just point out that I’ve just described to you sort of a romp, if you will, through, 889 01:09:21,310 --> 01:09:25,230 you know, a progression over the last quarter century that just started with saying this 890 01:09:25,229 --> 01:09:29,849 is audacious, just getting to sequence the human genome. But now, I was actually thinking 891 01:09:29,850 --> 01:09:33,069 about how are we actually going to use this information for clinical management like I 892 01:09:33,069 --> 01:09:36,640 described in the last slide? And it actually is more complicated than that because what 893 01:09:36,640 --> 01:09:40,230 we have found as a community of scientists that were mostly basic scientists thinking 894 01:09:40,229 --> 01:09:46,669 about genome structure and function and evolution, now we’re confronting the ecosystem of medicine. 895 01:09:46,670 --> 01:09:50,370 And the genomic medicine ecosystem, as it’s turning out, is really complicated because 896 01:09:50,370 --> 01:09:54,830 anytime you go to change medical practice, you start touching lots of things that are 897 01:09:54,830 --> 01:09:58,670 really complicated.
And that ecosystem is not healthy unless you think about all the 898 01:09:58,670 --> 01:09:58,890 things. 899 01:09:58,890 --> 01:10:03,830 So what do I mean by this? Well just think about health care delivery and reimbursements 900 01:10:03,830 --> 01:10:09,990 and all the aspects of changing clinical practice. It’s like this -- it’s really complicated. 901 01:10:09,990 --> 01:10:12,790 Parts of our institute, parts of the community are dealing with this. I’m not going to 902 01:10:12,790 --> 01:10:16,380 talk about it, I’m just going to point out to you that it’s now becoming much bigger 903 01:10:16,380 --> 01:10:19,860 than we ever thought it was when we were just thinking about just the genome. You know, 904 01:10:19,860 --> 01:10:25,070 there’s other aspects of this related to also education and genomic literacy. You know, 905 01:10:25,070 --> 01:10:30,840 we need -- the language of genomics will be spoken as part of medical care. We need a 906 01:10:30,840 --> 01:10:35,050 literate public -- patients to go in. And you can see here is starting to train the 907 01:10:35,050 --> 01:10:39,380 next generation of what this is all about. We also have a whole profession out there 908 01:10:39,380 --> 01:10:44,150 of physicians and physician’s assistants, and pharmacists, and nurses. They all need 909 01:10:44,150 --> 01:10:47,820 to be literate in genomics. And how do you do that when they’re at mid-career, not 910 01:10:47,820 --> 01:10:50,960 just when you’re getting them when they’re being trained. And we’re thinking about 911 01:10:50,960 --> 01:10:55,250 that. And these have huge complexities. Oh and, of course, we never thought about this 912 01:10:55,250 --> 01:10:59,100 stuff when the Genome Project began, but, you know, there’s a lot of regulatory oversight 913 01:10:59,100 --> 01:11:04,070 associated with any practice -- any aspect of practicing medicine.
And where genomics 914 01:11:04,070 --> 01:11:07,420 meets that regulatory oversight, there’s a lot of stories and sub-stories there that 915 01:11:07,420 --> 01:11:08,190 we’re dealing with. 916 01:11:08,190 --> 01:11:11,160 So I’m just mentioning these, not that I’m going to talk about them, but just to point 917 01:11:11,160 --> 01:11:16,059 out that it really is complicated and sometimes it’s almost daunting and overwhelming. So 918 01:11:16,060 --> 01:11:20,380 as, you know, we draw nice graphics that get us over to actually changing the practice 919 01:11:20,380 --> 01:11:25,010 of medicine, you know, there’s new surprises along the way and new mountains that we have 920 01:11:25,010 --> 01:11:29,770 to climb. And I will tell you at times it can get very exhausting because all of a sudden 921 01:11:29,770 --> 01:11:33,240 we’re dealing with the complexities of health care delivery on top of everything else and 922 01:11:33,240 --> 01:11:38,360 you can get rather pessimistic at times. But as I transition to my last topic, I do just 923 01:11:38,360 --> 01:11:42,910 want to point out I like this quote because sometimes when I show this graphic people 924 01:11:42,910 --> 01:11:46,680 say, “Oh my gosh, you’re never going to do this.” And I just think that a pessimist 925 01:11:46,680 --> 01:11:51,040 would have that view because a pessimist sees the difficulty in every opportunity. I think 926 01:11:51,040 --> 01:11:55,550 our community of genomicists, you know, we see this progression and we see the ecosystem 927 01:11:55,550 --> 01:11:59,230 and yes, it is daunting, but we don’t get -- we don’t think of it as pessimists, we 928 01:11:59,230 --> 01:12:02,790 see it as this is a great opportunity. And I think this is why we keep pushing forward to 929 01:12:02,790 --> 01:12:07,960 see this progression become a reality, even when it really gets complicated. 930 01:12:07,960 --> 01:12:11,940 So let me transition now.
I’m going to spend the last 15 minutes on one last topic and 931 01:12:11,940 --> 01:12:16,049 I think it is -- because it will talk a little bit about the future. And the future is a 932 01:12:16,050 --> 01:12:21,590 reflection, I think, in many ways of the change of what we have seen in genomics, not about 933 01:12:21,590 --> 01:12:25,730 the science that I have described to you for the last little over an hour, but even just 934 01:12:25,730 --> 01:12:29,910 the relevance of genomics. It touches on even the ecosystem aspect I was just talking about 935 01:12:29,910 --> 01:12:35,030 because when genomics started, you know, 25, 30 years ago, this -- it was a discipline 936 01:12:35,030 --> 01:12:39,340 just involving biomedical researchers. You know, I was one of those geeks just worrying 937 01:12:39,340 --> 01:12:43,840 about, you know, mapping and sequencing DNA. I think there was a pivotal transition when 938 01:12:43,840 --> 01:12:48,690 the Genome Project ended as we recruited health care professionals to start to work with us 939 01:12:48,690 --> 01:12:51,719 to think about how genomics might change. And we’re starting to see the fruits of 940 01:12:51,720 --> 01:12:56,430 that. And the hot areas are real because health professionals are getting involved in thinking 941 01:12:56,430 --> 01:12:57,620 about how to use genomics. 942 01:12:57,620 --> 01:13:03,019 But I think the real change that’s going on now is that patients are becoming relevant 943 01:13:03,020 --> 01:13:09,320 in this conversation, and therefore friends and relatives of patients because genomics 944 01:13:09,320 --> 01:13:14,290 is becoming part of the language of cancer care, of pharmacogenomics, of rare diseases, 945 01:13:14,290 --> 01:13:18,180 and so forth, and you see it all the time in the news, on advertisements, in newspapers, 946 01:13:18,180 --> 01:13:24,060 and so forth. This is part of society now.
And there’s a lot of issues that become 947 01:13:24,060 --> 01:13:28,010 relevant. I touched on some of them, like education. But there’s a lot of issues even 948 01:13:28,010 --> 01:13:33,630 around public health. And so, we think a lot about the societal implications and societal 949 01:13:33,630 --> 01:13:39,050 complications of genomics and think about some of the public health aspects of it, which 950 01:13:39,050 --> 01:13:42,890 is why Colleen McBride was invited to come here on March 23. She’s going to talk about 951 01:13:42,890 --> 01:13:45,210 more -- about some of these things. 952 01:13:45,210 --> 01:13:50,920 But because this is becoming so relevant for all of us and seeing the great potential for 953 01:13:50,920 --> 01:13:55,240 this, I did think -- I thought -- I would just use the last minutes to just update you 954 01:13:55,240 --> 01:13:59,260 about some breaking news. It’s not that breaking anymore but there’s a lot happening 955 01:13:59,260 --> 01:14:04,210 here. It was particularly breaking when it started in June of 2014, because it involved 956 01:14:04,210 --> 01:14:10,660 this guy. I hope all of you know who this man is. And besides being President of the 957 01:14:10,660 --> 01:14:15,769 United States, this guy really likes science and he really likes genomics, I’m proud 958 01:14:15,770 --> 01:14:20,520 to say. And in June of 2014, he started some conversations actually with Francis Collins, 959 01:14:20,520 --> 01:14:26,330 our director, around the idea of maybe launching a big project near the tail end of his Presidency 960 01:14:26,330 --> 01:14:32,480 that might involve this vision of genomics, genomic medicine, individualizing patient 961 01:14:32,480 --> 01:14:36,080 care, and so forth, because he thought it really could have great impact on the future. 962 01:14:36,080 --> 01:14:40,100 And he wanted to see what he could do in the phases of his Presidency.
963 01:14:40,100 --> 01:14:45,620 Those conversations evolved in the summer of 2014, eventually getting framed around 964 01:14:45,620 --> 01:14:51,480 the concept of precision medicine. But precision medicine really sort of goes a step beyond 965 01:14:51,480 --> 01:14:56,440 genomic medicine. Genomic medicine would just be the Gs, As, Ts, and Cs -- showed here what 966 01:14:56,440 --> 01:15:00,849 precision medicine is being more precise by starting to account for things like environmental 967 01:15:00,850 --> 01:15:06,000 exposure, lifestyle, diet, and things like that, other aspects of our life that we might 968 01:15:06,000 --> 01:15:10,910 be able to use as information to be more precise for medical care. It is just a broader context 969 01:15:10,910 --> 01:15:15,200 for individualizing medical care to advance human health. 970 01:15:15,200 --> 01:15:21,290 And what the President saw as a great potential and aided by people who he spoke to, began 971 01:15:21,290 --> 01:15:26,210 to get a great appreciation for the idea that, you know, today we really do most of our medical 972 01:15:26,210 --> 01:15:29,920 care based on the expected results of the average patient. Almost everything we do is 973 01:15:29,920 --> 01:15:34,560 based on the average patient. But the world is changing and there could be a tomorrow 974 01:15:34,560 --> 01:15:39,000 where we could be more precise if we would only account for individual genomic differences, 975 01:15:39,000 --> 01:15:45,140 environmental differences, lifestyle differences, and have that as a way to be more precise 976 01:15:45,140 --> 01:15:49,190 in preventing and treating diseases. And the President really just wanted to know how could 977 01:15:49,190 --> 01:15:51,480 he get from today to tomorrow. 
978 01:15:51,480 --> 01:15:57,980 And so a series of strategic planning efforts went on, and actually quite small, but important 979 01:15:57,980 --> 01:16:02,250 over the small numbers of people because the President wanted this quiet. It was during 980 01:16:02,250 --> 01:16:07,230 the summer of 2014, leading to sort of plans that emerged here at NIH and other parts of 981 01:16:07,230 --> 01:16:12,240 the federal government. In the fall of 2014, the President was presented this plan. This 982 01:16:12,240 --> 01:16:15,900 is actually a picture from that meeting -- you see Francis here -- other very important people 983 01:16:15,900 --> 01:16:20,940 I won’t go through. The meeting took place in October of 2014, where this plan was discussed 984 01:16:20,940 --> 01:16:24,980 and then the President got fully behind it, announced it in the State of the Union address 985 01:16:24,980 --> 01:16:30,559 in early 2015, and then formally announced here and shown here in these pictures from 986 01:16:30,560 --> 01:16:35,060 the East Room of the White House. These are photographs I got to take sitting center and 987 01:16:35,060 --> 01:16:40,460 fairly close to the podium when the President announced January 2015, the launch of this 988 01:16:40,460 --> 01:16:42,640 thing called the Precision Medicine Initiative. 989 01:16:42,640 --> 01:16:47,780 At the exact time that the President announced this in January of 2015, the “New England 990 01:16:47,780 --> 01:16:51,559 Journal” published this paper describing -- it’s actually the only scientific paper 991 01:16:51,560 --> 01:16:56,360 officially coming out describing the general framing of the Precision Medicine Initiative 992 01:16:56,360 --> 01:17:00,830 by Francis Collins and then-NCI Director Harold Varmus. If you haven’t read this I encourage 993 01:17:00,830 --> 01:17:05,170 you to read it. 
Harold was an author in part because there was going to be a major cancer 994 01:17:05,170 --> 01:17:09,330 genomics along the lines of what I said, part of the Precision Medicine Initiative. I’m 995 01:17:09,330 --> 01:17:13,290 not going to describe that now. I just thought I would briefly describe the other major element 996 01:17:13,290 --> 01:17:17,430 you’ll be hearing a lot about in the coming -- next coming months, actually hopefully 997 01:17:17,430 --> 01:17:19,630 the coming years and even decades. 998 01:17:19,630 --> 01:17:24,040 And that relates to the Precision Medicine Initiative’s launching of a U.S. national 999 01:17:24,040 --> 01:17:30,680 research cohort. The idea is to collect -- recruit and enlist millions of people, at least a 1000 01:17:30,680 --> 01:17:36,100 million, hopefully more, U.S. volunteers who will agree to participate in this hopefully 1001 01:17:36,100 --> 01:17:41,690 multi-decade project and program where the participants are going to share genomic data, 1002 01:17:41,690 --> 01:17:45,309 lifestyle information, biological samples, all of this will be linked to their electronic 1003 01:17:45,310 --> 01:17:51,850 health records. And this is a big kind of project that not only aims to do incredible 1004 01:17:51,850 --> 01:17:56,620 studies, but also to forge new models for how science is done. The idea is to do this 1005 01:17:56,620 --> 01:18:03,330 in a fashion that fully engages the participants, shares the data very openly in a very genomics-like 1006 01:18:03,330 --> 01:18:07,710 way of having data sharing, and also to of course make sure that all their privacy is 1007 01:18:07,710 --> 01:18:10,010 adequately protected. 1008 01:18:10,010 --> 01:18:14,790 The notion of having such a big program of involving many years studying lots and lots 1009 01:18:14,790 --> 01:18:20,230 of people actually isn’t brand new. 
None other than Francis Collins when he had the 1010 01:18:20,230 --> 01:18:25,559 job I currently have about a decade ago, actually called for this in a commentary that he wrote 1011 01:18:25,560 --> 01:18:30,590 in 2004. So you may wonder, well why did we bring back to the President a decade later 1012 01:18:30,590 --> 01:18:34,540 an idea that was a decade old that never went anywhere because it turned out when this got 1013 01:18:34,540 --> 01:18:38,730 proposed in 2004, it just didn’t get any traction, in part because it was a little 1014 01:18:38,730 --> 01:18:43,450 too early and it was a little too expensive. But the world had changed in the intervening 1015 01:18:43,450 --> 01:18:48,450 decade, which is why we brought a -- new version of this was brought back to the President 1016 01:18:48,450 --> 01:18:52,650 in 2014, why it got traction then. 1017 01:18:52,650 --> 01:18:56,740 And let me just briefly tell you why the world has changed in the last 10 years and it very 1018 01:18:56,740 --> 01:19:01,160 much overlaps with some of the things I’ve described earlier. You know, compared to a 1019 01:19:01,160 --> 01:19:06,040 decade ago genomics has changed. I mean, think about all the things I’ve described, the 1020 01:19:06,040 --> 01:19:09,490 cost of sequencing genomes or understanding the genome, understanding the variants and 1021 01:19:09,490 --> 01:19:15,469 so forth. So genomics: breathtaking changes in the last decade and that sort of has been 1022 01:19:15,470 --> 01:19:19,770 very influential for the launch of the Precision Medicine Initiative. But there’s other areas 1023 01:19:19,770 --> 01:19:23,580 -- electronic health records. Electronic health records are critical for what’s going to 1024 01:19:23,580 --> 01:19:29,340 be done to capture this information electronically. But a decade ago it was only about 20 percent of health 1025 01:19:29,340 --> 01:19:34,280 care professionals in settings had electronic health records.
That’s why you couldn’t 1026 01:19:34,280 --> 01:19:37,400 -- what didn’t exist -- the infrastructure didn’t exist in 80 percent of places to 1027 01:19:37,400 --> 01:19:42,059 collect the information. Now that figure is over 95 percent in the U.S. of health care 1028 01:19:42,060 --> 01:19:42,840 sites. 1029 01:19:42,840 --> 01:19:48,980 And then meanwhile we have done a lot to learn how to marry genomic information with electronic 1030 01:19:48,980 --> 01:19:53,459 record information and other information feeding with electronic health records. We have a 1031 01:19:53,460 --> 01:19:57,690 program that’s been going on since 2007, called eMERGE for Electronic Medical Records 1032 01:19:57,690 --> 01:20:02,460 and Genomics, which has really taught us a tremendous amount of how we might be able 1033 01:20:02,460 --> 01:20:07,700 to capitalize on genomic information, electronic medical record information, and other information 1034 01:20:07,700 --> 01:20:12,650 in electronic medical records. And that has served as, in some ways, a pilot for what’s 1035 01:20:12,650 --> 01:20:15,879 being envisioned for the Precision Medicine Initiative. 1036 01:20:15,880 --> 01:20:19,560 But meanwhile, what the President also -- and the Congress -- by the way who’s now funding 1037 01:20:19,560 --> 01:20:23,380 this -- got very enthusiastic about is the recognition that it’s even beyond genomics into 1038 01:20:23,380 --> 01:20:29,870 other technologies and better and better ways of measuring physiology and environmental 1039 01:20:29,870 --> 01:20:35,540 exposures and lifestyle. And, you know, this idea of -- and I’ll show you a paper just 1040 01:20:35,540 --> 01:20:41,290 from last year -- you know, the -- all these new sensors, these mHealth devices that measure 1041 01:20:41,290 --> 01:20:47,600 all sorts of things about our physiology, our various analytes, our cardiovascular system, 1042 01:20:47,600 --> 01:20:53,680 and so forth.
This is all sort of just at early stages and many people wear FitBits. 1043 01:20:53,680 --> 01:20:57,490 Those are recreational devices. They’re awful, but much more robust technologies are 1044 01:20:57,490 --> 01:21:02,679 coming. And one can imagine harnessing the power of those technologies, having these 1045 01:21:02,680 --> 01:21:08,060 individuals wear these devices, collect the data, and have that data streaming into 1046 01:21:08,060 --> 01:21:12,100 central data resources for scientists to analyze. Oh, by the way, it’ll stream in through 1047 01:21:12,100 --> 01:21:17,420 their smartphones, which a decade ago, about -- only about a million smartphones existed 1048 01:21:17,420 --> 01:21:21,480 and only two percent of Americans carried a smartphone a decade ago. But now over 60 1049 01:21:21,480 --> 01:21:23,799 percent of Americans carry a smartphone. 1050 01:21:23,800 --> 01:21:29,090 So one can imagine having an immense amount of genomic data, electronic health record 1051 01:21:29,090 --> 01:21:33,420 data, mobile health data, all streaming in on a million or more people, and that will 1052 01:21:33,420 --> 01:21:38,200 create an amazingly rich data resource, but it also means we have a lot of data analysis 1053 01:21:38,200 --> 01:21:43,559 to do. And data science will become a prominent feature of this. But you know what? As Tyra 1054 01:21:43,560 --> 01:21:47,630 and Andy are going to continue to tell you in subsequent lectures, we’re in a new world 1055 01:21:47,630 --> 01:21:52,200 here. Data science is front and center in biomedical research, compute power’s gone 1056 01:21:52,200 --> 01:21:56,830 up 160-fold in the last decade. And we’re, you know, we’re not totally ready, but we’re 1057 01:21:56,830 --> 01:21:59,500 going to have to be ready because this is what we’re going to be doing as biomedical 1058 01:21:59,500 --> 01:22:00,550 researchers.
1059 01:22:00,550 --> 01:22:05,380 But the last element, which I do think will be interesting and important to watch that 1060 01:22:05,380 --> 01:22:10,550 will make this different, is the notion of how we will engage the individuals who will 1061 01:22:10,550 --> 01:22:15,130 be part of the Precision Medicine Initiative and the cohort, in particular. These people 1062 01:22:15,130 --> 01:22:20,720 will not be subjects. They’re not going to be patients. They’re going to be partners. 1063 01:22:20,720 --> 01:22:23,760 And the reason they’re going to be partners is that studies have shown and continue to 1064 01:22:23,760 --> 01:22:27,900 show that actually most Americans want to participate in biomedical research, but they 1065 01:22:27,900 --> 01:22:33,799 only want to participate if they are sort of engaged from the beginning that they know 1066 01:22:33,800 --> 01:22:38,090 what’s happening to their data, they get to opt in and out to things along the way, 1067 01:22:38,090 --> 01:22:42,270 and if they’re treated as partners. And from the very beginning of the planning the 1068 01:22:42,270 --> 01:22:46,830 participants are being featured as partners in the scientific enterprise. And I think 1069 01:22:46,830 --> 01:22:52,190 the whole social media/Facebook era is very important here in how we will engage them 1070 01:22:52,190 --> 01:22:55,429 through social media, through smartphones, and so forth. And it’ll be a new way of 1071 01:22:55,430 --> 01:22:58,430 doing science. I can tell you there’s a lot of cohorts that have been created in the 1072 01:22:58,430 --> 01:23:03,140 United States. This one’s going to be unlike any of the ones that have preceded it. 1073 01:23:03,140 --> 01:23:07,430 So there’s a lot associated with this. 
If you want to read the current blueprint for 1074 01:23:07,430 --> 01:23:12,070 the Precision Medicine Initiative cohort, there’s a working group report that you 1075 01:23:12,070 --> 01:23:17,269 can read and for -- actually the URL will be here. It’s a very convenient URL to keep in mind. 1076 01:23:17,270 --> 01:23:20,450 This is the landing page for the initiative, which hopefully will exist for the next 10 1077 01:23:20,450 --> 01:23:24,870 and 20 years or 30 years, because a lot of information about the initiative is being 1078 01:23:24,870 --> 01:23:30,240 put to make this a very transparent process on the landing page for the NIH -- NIH’s 1079 01:23:30,240 --> 01:23:33,040 effort in the Precision Medicine Initiative. 1080 01:23:33,040 --> 01:23:39,450 So I wanted to end by sort of tying it all together because it’s actually very interesting. 1081 01:23:39,450 --> 01:23:45,000 The Precision Medicine Initiative sort of gives me a very interesting sense of déjà 1082 01:23:45,000 --> 01:23:50,100 vu because there are a lot of things we don’t really know how it’s going to play out. 1083 01:23:50,100 --> 01:23:55,080 It is an audacious -- yet another audacious effort and there’s so many uncertainties 1084 01:23:55,080 --> 01:24:01,830 associated with it. And it happens to be happening exactly 25 years or so after the launch of 1085 01:24:01,830 --> 01:24:05,930 the Human Genome Project. And if you would have asked me -- and I was there -- the day 1086 01:24:05,930 --> 01:24:10,900 the Genome Project started, if you would have asked me, “Well, how exactly are you going 1087 01:24:10,900 --> 01:24:15,710 to map and sequence the human genome?” I would have said, “I have no idea. But it 1088 01:24:15,710 --> 01:24:20,330 just is a compelling goal and I think we can do it. And we’ll figure it out as we go.” 1089 01:24:20,330 --> 01:24:23,920 And that’s exactly what we did in getting the Human Genome Project completed.
We set 1090 01:24:23,920 --> 01:24:27,540 audacious goals and we were willing to change course as needed, and I’m telling you, it 1091 01:24:27,540 --> 01:24:31,960 feels exactly the same now 25 years later with the Precision Medicine Initiative. We’re 1092 01:24:31,960 --> 01:24:35,580 launching it, it’s audacious, it has these -- we’ve never done anything like this before, 1093 01:24:35,580 --> 01:24:38,670 and there’s so many details. If you ask me or the people who are going to be organizing 1094 01:24:38,670 --> 01:24:42,190 that on the front line, they’ll say, “Well, we haven’t really figured that one out yet,” 1095 01:24:42,190 --> 01:24:46,419 but that’s okay because they’ll figure it out as we go. I think these -- the comparison 1096 01:24:46,420 --> 01:24:49,870 is sort of a -- and the Precision Medicine Initiative will go on longer, actually than 1097 01:24:49,870 --> 01:24:54,500 the Genome Project, I predict. But it’s that same audacious willingness to sort of 1098 01:24:54,500 --> 01:24:58,940 change mid-course. But I think it’s absolutely required in order for it to be successful. 1099 01:24:58,940 --> 01:25:02,690 And so this is -- I gave you that bit at the end because I think it’s very exciting to 1100 01:25:02,690 --> 01:25:07,559 watch and maybe some of you will participate in this either as part of the cohort or as 1101 01:25:07,560 --> 01:25:11,500 researchers analyzing the data and I hope that you do. And there’ll be a lot of news 1102 01:25:11,500 --> 01:25:15,800 coming out in the coming weeks and months about it as we stand this initiative up in 1103 01:25:15,800 --> 01:25:17,750 the next year or two. 1104 01:25:17,750 --> 01:25:23,150 Lastly, if the topics I talked about and programs I talked about are of interest to you, I will 1105 01:25:23,150 --> 01:25:27,480 shamelessly plug this because it’s free. 
I put out a weekly -- not a weekly, that would 1106 01:25:27,480 --> 01:25:34,230 kill me -- a monthly newsletter that the institute’s staff helped me put together that highlights 1107 01:25:34,230 --> 01:25:37,509 things along the lines of what I described here. And if you want to get that feel free 1108 01:25:37,510 --> 01:25:42,450 to follow the link on this and you can subscribe to it and get a monthly newsletter from me. 1109 01:25:42,450 --> 01:25:47,000 So I realize I am coming up exactly on the time that was allocated to me so I will end 1110 01:25:47,000 --> 01:25:50,910 there and I would encourage anybody who needs to leave they should leave and maybe people 1111 01:25:50,910 --> 01:25:53,830 who have questions should just come down and see me at the podium so that we can finish 1112 01:25:53,830 --> 01:25:54,600 the official session in an hour-and-a-half. Thank you. 1113 01:25:54,600 --> 01:25:54,600 [applause] 1114 01:25:54,600 --> 01:25:54,610 [end of transcript]