
Operating systems— now and in the future


Dejan Milojicic, Hewlett Packard Laboratories, 1501 Page Mill Rd., Palo Alto, CA 94304; dejan@hpl.hp.com

IEEE Concurrency

With this issue, we are starting a department on the trends in computer science. In particular, we are interested in controversial aspects and competing interests, hence the title "Trend Wars." This department will address the trends that have attracted attention recently, possibly marking a significant change in the way we perceive aspects of our world and work today.

This first installment addresses the field of operating systems. I have spent over 15 years working on OSs, system software, and distributed systems. With the recent appearance of Java and the Web, it seemed to me that the amount of research and development in the OS field was declining.

Many of my colleagues felt the same way, and many changed fields. Indeed, I spent the subsequent two years working on middleware systems. Indicators of decline included the reduced number of Unix systems being developed, the domination of Microsoft NT at the lower end, and the attractiveness of the Java and Web research that redirected R&D efforts from the OS field.

However, despite an initial perception of decline, OS R&D never really stopped. We saw successful OS events, such as SOSP and OSDI, and the amount of industrial investment in OS development grew, as did the applicability of OS principles to middleware-level systems. The market presence might rebalance and realign dominance in low-, mid-, and high-end computer systems, but OS development seems to exist at each level.

To put trends in OS research and development in a more rigorous perspective, we asked six prominent OS researchers and developers for their opinions: David Black (EMC Corp.), Bill Bolosky (Microsoft Research), Frans Kaashoek (MIT), Jochen Liedtke (IBM TJ Watson Research Center), Jeff Mogul (Compaq Western Research Lab), and John Wilkes (Hewlett-Packard Labs). In the discussions that follow, we include their replies to the following five groups of questions.

1. OS research today and in future: (a) What problems should OS researchers try to solve? (b) What are the problems that OS research should not try to resolve? (c) Compare OS research with the past (amount, scope, topic, type, and so forth). (d) Extrapolate to future OS research.

2. OS development today and in future: (a) NT versus Unix (where is the boundary, and how will it move); what is the future OS of high-end servers? Will we end up with only one industrial OS? (b) What is the future of special-purpose OSs (such as for embedded systems, real-time, and fault-tolerant systems)? (c) What is the impact of contemporary storage solutions (such as network-attached storage)? (d) What is the impact of contemporary networking solutions?

3. Free versus proprietary: (a) Will free OSs ever achieve the maturity stage of free tools (such as GNU)? Consider the Linux phenomenon and its similarity to BSD evolution. (b) Which platform is more suitable for what research?

4. Transfer of research to industry: microkernels, extensible kernels, and virtual environments: what is next, and (how much) does it matter? (a) Have microkernels failed? How much microkernel experience has been deployed? (b) Extensible kernels have been around a while: has the technology been transferred to industry? (c) What are the chances for virtual environments to be deployed in industry? (d) What is the next topic du jour?

5. What is the impact on OS R&D of (a) mainframe OS technology (such as processor power, memory size, fault-tolerance requirements, I/O)? (b) The Web: what is the applicability of OS principles? Will the Web subsume OS research? (c) Java and the like: are Java, Jini, JavaOS, and other products subsuming the OS?

Readers are encouraged to comment on these discussions. A fuller version of these interviews will appear in the TCOS newsletter.

Dejan Milojicic is a senior scientist at HP Labs, where he leads the kernel internals group of the multicomputer systems program. He has worked in the area of OSs and distributed systems for over 15 years. He is the program chair of the forthcoming Agent Systems and Applications Symposium (ASA '99), the chair of the IEEE Technical Committee on OSs (TCOS), a member of the editorial board of IEEE Concurrency, and on the executive committee of the IEEE Technical Committee on Internet. He received a BS and MS in electrical engineering from the University of Belgrade, Yugoslavia, and a PhD in computer science from the University of Kaiserslautern, Germany. Contact him at Hewlett Packard Labs, MS 1U-14, 1501 Page Mill Rd., Palo Alto, CA 94304; dejan@hpl.hp.com; http://www.hpl.hp.com/personal/Dejan_Milojicic/.

David Black

OS research today and in the future: what problems should OS researchers try to solve?

There are three of them.
One is modular construction. OSs have been surprisingly resistant to the kinds of modular construction techniques that we see in software engineering. Among the main reasons for this is performance.

The second is distributed and mobile systems. Distribution is a long-standing problem, but it is not completely solved, and mobility (the desire to put computation in small things that move around and are intermittently and unreliably connected) changes things. The best approach here is to build fundamental software components that are useful, as opposed to reinventing the entire system from scratch.

And finally, we have the impact of other technologies on OSs, including special-purpose OSs. There is a whole collection of technologies, such as distributed-systems security, the levels of networking above TCP/IP, and storage, that are all in very interesting states of flux, creating opportunities to do new and interesting work.

Can you compare today's OS research with OS research in the past?

As OS research progresses, the way that research succeeds changes. At one point, replacing Unix in toto was the path to success. That is no longer the route to research success because the important problems have changed. Unix is now a mature system, whether we like it or not. I don't mean to denigrate Linux or the people working on it, but a lot of that is engineering rather than research. In contrast, I think the rapidly expanding impact of embedded systems is an area where very interesting things are possible and, in fact, are going to happen.

What is the future of special-purpose OSs?

A bright one, especially for embedded systems. I consider much of what people think of as real time to be embedded-systems work.
The decreasing cost of technology opens up an ever-increasing spectrum of opportunities for limited- and dedicated-function devices. They are optimized to do a small number of things extremely well, as opposed to a general OS that needs to support almost everything in some reasonable fashion.

Fault tolerance is different because it's in an earlier stage of evolution. The original approach of using fault-tolerant hardware and OSs is now being replaced by software fault-tolerance technology based on clustered general-purpose hardware with general-purpose OSs. It is going to open up opportunities such as specialization of software fault tolerance for embedded systems. And if you bring in real time, now you have a serious research area.

What is the impact of contemporary storage solutions?

Switched Fibre Channel interconnects are going to fundamentally change the way that storage works and the way that the OS community thinks about storage. In essence, disk storage, which we now think of as being installed in the computer, is about to become distributed and sharable. Fibre Channel speaks SCSI to storage rather than TCP/IP; this is about the disk becoming distributed underneath the file system, and it raises issues different from past work on distributed file systems. There is a wealth of interesting research issues in this area.

What is the impact of contemporary networking solutions?

There are several things going on here that are interesting to the OS community. One is that interconnects are getting faster and more reliable, making much tighter system coupling possible. What we used to think of as only being doable as a multicomputer sitting inside a box is now doable by going out and buying parts.
On the other end of the spectrum, mobility is also becoming increasingly important in contemporary networking and raises a whole new set of issues due to dynamic variations in speed and reliability. Beyond that, one finds increasing bandwidth in the core of the Internet, but most OS research is on hosts that are not hooked up directly to the Internet core.

Network security is in a state of flux and will impact systems in ways that might not be obvious. Interestingly, the OS community used to know how to distribute security, namely via capabilities. There was a lot of work on capability-based systems. Networks didn't go that way, and we are only now reaching the point with digital certificates and related technology that interesting security systems based on ideas like capabilities have become possible.

Which platform is useful for which research?

One starts from the class of research and then looks at what one wants to do. System analysis obviously needs an existing system. Understanding the behavior of existing OS technology has become more important as that technology has become more complex.

For research that changes the way an OS does something, one needs an OS that is actually being used. The primary reason is to prove that the research addressed the corner cases. One wants to avoid having an embarrassing little thing that doesn't work turn into a thread that unravels everything when it's pulled. A virtue of having done the integration is that it enables an apples-to-apples comparison of how the new function works in the system versus how the system does that function. This sort of side-by-side comparison is convincing.
Beyond that, big-picture ideas should clearly start from scratch, pulling in existing system components for areas that aren't the target of research, such as device drivers. I don't think anyone considers device drivers to be an important topic of research, so by all means yank those from someplace else rather than spending research time reinventing them.

The actual selection of platform (Linux, NT, BSD, whatever) depends on a host of issues, many of them nontechnical: familiarity and preference of the researchers, the need for and availability of sources, et cetera. Implicit in that comment is that I think systems like Linux and some of the BSD variants are certainly commercial-class OSs for the purpose of research. Research results done in those systems are as valid as research done in a commercial product.

January–March 1999

Can you comment on the transfer of research to industry?

Both microkernels and extensible kernels went after the problem of customizing a shared multipurpose system so that it could be extremely good for something while retaining general-purpose support. In 20/20 hindsight, hardware has gotten cheap enough that the general-purpose support aspect of this problem is far less important than when this class of research started. Embedded special-purpose systems are being built that reflect some of the specialization ideas that these systems developed, but not the overall protection, extensibility, and separation-of-functionality frameworks.

What's the next topic du jour?

That's like predicting next year's trend in fashion. Good luck.

What about Java?

Java is an important technology, but still has performance problems. This is why JavaOS did not take off. Predicting the future here is tough.
System management appears promising, because in the management space, performance becomes much less of a make-or-break concern compared to the flexibility and write-once, run-everywhere properties that are Java's forte. Have you ever seen systems benchmarked on management operations per second or management throughput?

David L. Black is a senior technologist at EMC Corporation working on software architecture and design issues for enterprise storage. His professional interests encompass networking, real-time systems, security, and distributed computing. He was one of the key developers of the Mach OS at Carnegie Mellon University, and received his PhD for work on that system. At the OSF (later The Open Group) Research Institute, he was a key contributor to a broad spectrum of projects involving Mach technology in both its integrated kernel and microkernel forms between 1990 and 1998. His network quality of service activities include coauthoring the Differentiated Services header and architecture RFCs. Contact him at the EMC Corporation, 42 South St., Hopkinton, MA 01748; black_david@emc.com.

Bill Bolosky

What problems should OS researchers be trying to solve?

The most interesting problems are related to managing very complex systems built from independently derived components. That is, how can we do what many industrial places have been trying to do for a long time: figure out how to take stuff built from many different places, put it together, and have it work consistently? I don't think anybody quite understands how to do that.

We've certainly seen this problem at Microsoft. We were well aware of how difficult it is to install software on NT and get it to work. Some people might think that's because the folks who designed Microsoft OSs aren't all that bright, but that's not really what was going on.
It's just that we were farther out on the curve of having to deal with a large number of vendors and larger, more heterogeneous bases of hardware. So, we were just getting hit by the same problem that's going to hit everybody else later.

Another interesting problem is how, if at all, we are going to take the existing medium- to small-scale OSs that we have now, such as Unix and NT, and get the kind of scale and reliability that we see driving the enterprise. The things that make getting scale and reliability hard (harder for us at Microsoft than for IBM, who did it in the past) are that we don't build the hardware and that the hardware that we run on is miserable. This is true especially for PC hardware and somewhat for workstation hardware. The memory systems aren't as good; the I/O systems aren't as good. They might perform as fast as mainframes, but they're not as reliable.

But we're never going to get hardware that costs what PCs or workstations cost to be as reliable as the much more expensive mainframe hardware. So, the only hope for competing is to use clustering.

What do you think about NUMA (nonuniform memory access) machines? They seem to be coming into the game again.

I did my dissertation on NUMA, and I became quite disillusioned with it by the time I was done. The problem with NUMA software programming models, which is even worse with DSM (distributed shared memory) systems, is their premise: we're going to export this nice, familiar, single-address-space, multithreaded programming model, and the programmers will be able to just write programs as they did for SMP use, and everything will just work. That, it turns out, is easy to do, but getting it to work really efficiently is hard. Also, because caches have gotten so much faster than main memory, getting through memory takes so long that you really have to think about when you do it, and you have to be really careful about doing it.
Once the programmer has to be aware of where all the memory is, you've lost many of the programming model advantages that were the goal of doing NUMA and DSM.

What other problems should OS researchers be trying to solve?

One problem is security. In the traditional security model, which evolved from time-sharing systems, the machine lived behind a glass wall and you trusted the people who touched the machine. That just ain't so anymore. The machines now live in everybody's houses, and people can do anything to their machines that they want.

Now we've got this wonderful global communications infrastructure in the Internet. We can already deliver books, music, and software, and we'll soon have the bandwidth to deliver video. However, the people who produce and own those things get paid by copyright. So how can we build systems that run in hostile environments but make it impossible to steal content? Can we make it so that Tom Clancy is willing to sell his book as an HTML file? If we can figure out how to solve that, we've helped the world out a lot, because Tom is not going to be giving his books away in an easy-to-pirate form, and books are particularly easy to steal because they have so few bits in them.

But if I could go to Amazon.com and, instead of having them send a book through UPS, just download the book and still be able to pay the royalties, book prices would go down and the ease of distribution would go up. This would help increase the diversity of authors, and it would be good for pretty much everybody except the people who make paper. But there doesn't seem to be an easy solution to this problem. You can argue that it's not an OS thing, but I think it really is, at least in part.

What problems should we not try to resolve?
Well, I don't like DSM for the reasons I just told you. Also, single system image is nice if you can do it right, but it's hard to get right because of latency issues.

Absolute performance has been a big focus of OS research. We've all played this game: "My code executes in 10 fewer cycles than your code executes." This might sound heretical, but I think that absolute performance becomes increasingly less important as machines get faster. More and more total cycles will get spent executing things other than the OSs, so that the importance of the cycles in the OSs will go down. This has already happened to some extent.

How does past OS research compare to current research?

A lot of what we used to do in OSs focused on performance, and that was really important when we ran on a one-MIPS VAX with 20 people on it. Any cycle you spent anywhere caused everybody grief. It's not like that anymore.

Also, a lot of what we did earlier was solving very fundamental problems: how you build a file system that doesn't lose data, how you build a network that doesn't collapse when you put 10 people on it. We have pretty good solutions to those sorts of problems now, and we'll see less emphasis on that kind of stuff and more on trying to solve some of the problems that I talked about earlier.

Will we end up with one industrial OS?

Probably. Which one? It won't be MVS, because the pressure from the bottom is too hard to overcome. In the long run, IBM is playing a losing strategy there. Eventually, people will figure out how to solve the heterogeneity and the unreliable hardware problems, and the price for performance is so much better for the small-machine model than for the big one. It's just a question of whether the small model will win in the next decade or whether it's going to take 30 or 40 years.

Will OSs ever achieve the stage of tools?

OSs will never be quite the same as tools.
When you're looking at C, both ends of what C does are very well-specified. You can go out and read the C manual and see what the input is and the back end, where it talks to the OS.

When you look at what OSs have to do, the top end is fairly well-specified (there's a Unix API, a Windows API, or whatever), but the bottom is hardware. It's amazing how much of it doesn't work the way it's supposed to, and the OSs have to be tailored to deal with the idiosyncratic behavior of all the machines. I'm not just talking about devices that you plug into slots; fundamental things on motherboards don't work on computers that get shipped. So, there's some advantage to having a small set or a singleton set of OSs that deal with these things, because fewer points of communication are needed from the hardware vendors to the OS producers, and changing things is easier.

That's a miserable reason for OSs not to be commoditized; maybe hardware will eventually get better, but if history is any guide I wouldn't bet on it. The turnaround cycles on producing hardware are getting smaller, not larger, and the prices are getting lower, not higher. Those things all reduce quality.

Bill Bolosky is a researcher in the Operating Systems Research Group at Microsoft Research. He is currently helping design an aggressive new distributed OS as part of the Millennium project. He is also working on a component for Windows NT called the Single Instance Store that will allow NT file systems to have only a single on-disk instance of files of which there are logical copies. Contact him at Microsoft Research, One Microsoft Way, Redmond, WA 98052; bolosky@microsoft.com; http://www.research.microsoft.com/os/bolosky/.

Frans Kaashoek

What problems should OS researchers try to solve?
OS researchers should try to work on problems that are motivated by applications. They should have an application in mind when trying to solve a particular problem. In particular, I think they should try to focus on problems that result in either new functionality or dramatic improvements in performance. Small, incremental performance improvements are basically very uninteresting.

One of the challenges for the OS community in general is to be "far out." In industry, start-ups are moving fast and are pushing the boundary. So one of the challenges, I think, for the systems researcher is to try to think ahead of where the start-ups are and look at what the next set of problems is going to be.

What about OS research in the past?

Compared with past OS research, I guess golden times are ahead. Because I think more from a systems perspective than from a pure OS kernel perspective, the importance of appliances, sensors, and the fact that more and more functionality is being pushed into software will generate new interesting problems. For example, with software radios and things like that, there's going to be more and more opportunity for systems research to influence a broader set of issues than ever before.

Can you extrapolate on future OS research?

Things look great. Lots of interesting opportunities, lots of interesting problems, and a lot of new problems. Just read the newspapers. For example, recent articles have described how the number of computer appliances will dominate desktop systems. These appliances have many interesting system problems. This is also an opportunity to get out of the Wintel tar pit.
It's not clear what kind of systems are going to run on these appliances, but there's an interesting opportunity to diversify in terms of the number of available operating systems.

In terms of problems, we're talking about software having control over aspects and managing aspects of systems that it didn't have control over before. For example, on the processor side, I think future processors will expose more and more resources to the software levels, which means additional resources to manage and coordinate. For instance, in the RAW project at MIT, the hardware is a number of simple processors that are connected together, with the software controlling every aspect of the processor's execution.

As long as we're talking about operating systems, NT vs. Unix: Any thoughts?

I think there will be room for both of them for a long time to come. Again, I think this is because of the emergence of new types of applications and appliances other than desktop computers. What will run in these appliances? I see a number of products that run Linux. I think this trend will continue to grow. Also, I think the class of embedded operating systems will grow more and more important. Nevertheless, NT and Unix will be here for a while.

With these special-purpose operating systems, what do you think about real time and predictability?

Again, I think appliances, sensors, and actuators will eventually run (or are running) software and have microprocessors. The issues of embedded and real-time systems are reasonably important. Sensors are going to generate lots of streams of data, and researchers will need to be careful to design systems that can meet the appropriate deadlines. But I think it will be more software real time than hardware real time.
As for predictability, if anything, it will become more important. The more things are interconnected and the more our physical world phenomena are run by computer-controlled sensors and actuators, the more reliability and predictability will become dominant issues. It's the same with security.

Do you have any comments on the future in networking or storage?

I think it's going to be Ethernet, IP, and HTTP. And maybe HTTP will be the IP of the future, in the sense that everything will run across HTTP as opposed to IP. However, I wouldn't want to bet on that one.

With storage, I think future disk drives and other components are going to have their own processors. I think the point here is that processors are so inexpensive that there is plenty of room to stick some processing or computing capability on an appliance or in a device. All these devices will be talking IP. Having a general processor in an IP-talking device will fundamentally change these devices and open up room for interesting research and applications.

Have microkernels failed?

I don't think microkernels have failed. Certainly a number of embedded operating systems are microkernel-organized or microkernel-centric. But more importantly, I think a lot of software and large software systems tend to be organized like microkernels, with IPC between the different components and certain functionality implemented as servers. I think the whole idea of splitting things up into smaller modules that communicate with IPC has been extremely successful.

What about extensible kernels?

I think that it's a little bit too early to tell, but I think some of that technology is currently being transferred to industry.
For example, there is a startup doing exokernels and a number of companies that are playing around or evaluating whether they're going to employ an exokernel. That's just one example of an extensible kernel, one which I'm very familiar with. Whether extensible kernels are going to be successful in the long run, I don't really know. There are a lot of other social and economic issues involved. Will the idea of extensible kernels be important in the future? Yes, I think so.

What are the chances of virtual environments being deployed in industry?

Well, I think virtual machines are playing a more important role than ever before. I guess Mendel Rosenblum's work is one example, but there is also the Java VM, which is certainly playing a very important role and will continue to do so in the future. Virtual machine technology is going to have a good future. I think in a number of projects it will provide a great platform for operating systems research.

What is the applicability of OS principles? Will the Web subsume OS research?

I think the Web is going to be an area where there is going to be an incredible amount of systems research. When people start pushing new types of applications with new requirements (for example, security) on top of the Web, there will be more and more need for better distributed-computing technology than currently exists. It's going to be a fruitful area of research.

What about Java?

I think Java has a bright future.

Frans Kaashoek is an associate professor in MIT's Department of Electrical Engineering and Computer Science and a member of the MIT Lab for Computer Science. His research interest is computer systems: OSs, networking, programming languages, compilers, and computer architecture for distributed systems, mobile systems, and parallel systems. He received his Doctorandus degree and PhD from the Vrije Universiteit.
Contact him at the MIT Laboratory of Computer Science, Technology Sq., Cambridge MA 02139; kaashoek@pdos.lcs.mit.edu; http://www.pdos.lcs.mit.edu/~kaashoek.

Jochen Liedtke

OS research today and tomorrow: what problems should researchers try to solve?

I think one general problem to concentrate on is understanding why we can't predict results, particularly performance results, when we construct systems: OSs, application systems, or subsystems. Of course, we have the traditional goals and problems to attack: performance, correctness, the whole security problem, and, perhaps new, composability, meaning composable, highly configurable, flexible OSs.

As a basic problem, we have size. We have huge memories compared to 10 years ago. Even with mainframes, we have large sizes of processors and so on.

What are the problems OS research should not try to solve?

There are no such problems, because research is something that should attack, by definition, everything. The question is when do you skip something because you think it's no longer promising? I wouldn't exclude anything. Research basically is about unexpected ideas, or unexpected new insights.

Can you extrapolate to future research?

In general, you should never predict something, so extrapolation to future OS work is nothing I feel I should do as an OS researcher.

OS development today and in the future: NT versus Unix?

You may throw dice; I have no idea. I'm not an expert in social sciences or gambling theory. I can't comment substantially on this.

What is the future of special-purpose OSs?

That depends on what you call a special-purpose OS. Are real-time or embedded systems more specialized? This is a common misunderstanding we have as OS researchers, to think that time-sharing or a workstation system is general purpose. Try to control a car with such a system. It won't work.
What we basically have are more or less specialized systems, and I very strongly hope that we will come out with a composable OS technology that lets us specialize systems for various applications. For example, the real-time embedded systems, workstations, or servers might be specialized from the same set of modules, components, tools, and so on. So you want a general technology, which ends up in a more or less specialized system.

How does this idea of composability rank with some of the OS principles such as "do one thing but do it right"?

On a very low level, you need something like that. On the hardware level, we agree that you need a processor, which is not very specialized. On the next level, let's say the microkernel nucleus level, you need something that is also pretty general, but very soon you might come to components that you can't modify, that you can't exchange, so you can compose. The principle of concentrating on one thing and doing it right doesn't contradict the idea of composability. On the contrary, composability, if it really works and if the technology really works, permits you to concentrate on one module or one component and then integrate it in something and then concentrate on the next module. Of course, the way of composing itself is the centerpiece of this technology, so really optimizing this and generalizing it is basic.

Free versus proprietary: will OSs ever achieve the status of tools (GNU, C, and so forth)?

I'm no prophet. That's the problem: I'm not even an executive manager. I'm an OS researcher. I really can't comment very deeply on the principle. I suppose you mean free OSs. I would have to think about this because OSs are much more complex and much faster evolving than sequential tools, like the GNU compiler or other typical tools. I'm not sure. OS maintenance and development are much more expensive in two categories: people and money.
It's not clear whether, for instance, Microsoft has enough engineering skills and engineering power in this game. It's also not clear that the open-source community has enough money. So I don't know.

Transfer of research to industry: what is next and does it matter?

Yes, I think that everything you mentioned in this list is basically a facet of the same fundamental approach. This fundamental approach is basic systems architecture: understanding fundamental OS basics, building general, flexible mechanisms to support system construction on all levels and for hopefully all purposes, and finding powerful, minimal, elegant concepts and principles for this. The buzzword is not so important. The crucial question is whether this approach will find its way into real life, industry practice, and so on. I'm pretty sure that it will because it's simply necessary. Practice needs basic mechanisms to construct things; otherwise we won't be able to construct systems for the next decade.

So what is your assessment of the success of the transfer of microkernels to industry, for example? It has been already almost 15 years since they started.

I'm pretty optimistic. Of course, I am a microkernel fan. Microkernels will do it, but probably all of this will influence OS research, in the short or long term. The battle between NT and Linux makes us think it's less important and chances are lower, but for the real foundations, for the real technology, the Linux/NT battle is not so important. A sound foundation of OS technology will make its way into industry. There is no doubt.

What is the impact of OS R&D?

With a mainframe, technology triggers the OS research. In a way this is evolutionary, but in the last five or eight years, we have seen that our understanding of how systems work has to change dramatically. There are two basic points in the core OS field: the caches and the multiprocessor systems.
Before caches, the computer basically worked like a machine consisting of a processor and memory. The processor was basically responsible for time and memory for space. Think about this as two typical criteria for performance. But in the last five years, we recognized that current OSs became effectively 10 times slower as soon as the processor became 10 times faster: the OS's primitives and code took the same amount of actual time, whether you had a 10-MHz, 100-MHz, or 5-MHz processor.

What this really shows us is that the system we are thinking about, and doing research and development on, is processor-cache-memory, and the cache completely changes the game. We begin to realize that this is important, but we still do not really understand it. We cannot predict what happens. The performance implications of caches clearly influence how systems have to be constructed, including OSs as well as applications, in particular real-time multimedia, where predictability is important. That's still an open question.

We are looking for methods that let us understand how the system works, so that we can clearly predict how a system will behave before we construct it. Of course, this is also an optimal basis for finding optimum methods to construct systems so that they perform optimally on cache-based systems.

So basically, the point here is really understanding the entire processor-cache-memory system and finding general methods to utilize it in an open environment, where you cannot predict everything, for instance, in a workstation or a server system. This is a major challenge. For SMPs, it's even harder because the caching problem is a little bit harder for SMPs and you also get the bus problem. The main focus is the same: understanding, so you can predict what happens.

Jochen Liedtke is a research staff member at the IBM T.J.
Watson Research Center. His research focuses on fundamental OS architecture, particularly microkernels, integrating systems architecture with hardware architecture, and on constructing composable OSs. Contact him at jochen@watson.ibm.com.

Jeff Mogul

NT vs. Unix: what is the future OS of high-end servers? Should we end up only with one industrial OS?

Compaq and a number of other companies have established a balance between NT and Unix, and maybe even some proprietary OS, in their product lines. I have a gut feeling that no single system will take over in the foreseeable future, just because there are different requirements, especially in servers, where the requirements are probably much more varied than on desktops. When you ask whether we would end up with only one industrial OS: not for the foreseeable future, in large part because it would be very hard to satisfy everybody's needs with one.

What is the impact of contemporary storage systems?

Much past OS research was aimed at the performance of such systems and their reliability. The future problem is more likely to be how to manage large collections of systems. If you have 1,000 disk drives and you want to add a few more, you can't necessarily just go out and plug in another controller. In fact, scalability and manageability seem to be the key issues for large storage systems. That is where the network systems are going to shine, because networks are inherently more scalable.

What is the impact of contemporary networking solutions?

People have been saying about IP, for more than a decade, that it will run over anything, including a tin can on a string. The strength of the Internet protocol suite was, in large part, its ability to run over any network interconnect, with the difference being performance and availability.
People now understand how to build network systems that are relatively independent of the transport hardware, with the exception that some applications require more bandwidth or less latency than you can get over a certain interconnect. This is going to have minimal impact on OS research, although as you get another order of magnitude of bandwidth or a similar improvement in latency, this might enable new applications that might force OS people to rethink some things.

Will free OSs ever achieve the maturity stage of free tools?

Linux, as a freeware OS, probably has more attraction than some of the other freeware tools. People might be more willing to pay for a really high-quality compiler, one that ensures that applications actually do the right thing, than perhaps for an OS. There is no question that Linux is a growth business right now. You read the trade press and it is clear that people with suits on have adopted Linux.

BSD's evolution is interesting. The post-Berkeley BSD people, the various different fragments of the BSD market, never really caught on to what was necessary to achieve world domination, which is the term that Linus Torvalds often uses. Linus has been very good about keeping Linux as basically one OS, and making sure that it runs on enough hardware, supports device drivers, and also supports a lot of applications. The BSD systems, by contrast, fragmented that market and never really attacked the notion of being something that everybody wants to use. They were a lot more in the research market. Linux originally had the "let's get lots of users and then move up" approach.

Which platform is more suitable for what research?
The place where you would have a hard time doing interesting OS research in those freeware systems is where you are really trying to go for the last ounce of performance, and somebody's proprietary version of Unix or some other OS has spent a lot of effort matching the OS to the hardware, or dealing with complicated multiprocessor scaling issues.

Transfer of research to industry: Have microkernels failed? What about extensible kernels?

Since I work in a research lab inside an industrial concern, this is clearly an issue that is of interest to us. When you ask if microkernels have failed, that is asking the wrong question. Some people might have expected that microkernels were going to replace everything. Maybe 10 years ago, that was a valid expectation. Clearly, that hasn't happened, and it is not likely to because they don't necessarily solve the problems that some people need to solve. There are, of course, commercial systems that use microkernels and use them quite successfully. They seem to be more interesting for the embedded market, because especially in that market you need to have some way of getting the smallest possible system, and also reconfiguring the system for specific applications.

Extensible kernels are a much more recent issue. There again, it is not clear whether the problem that the extensible kernel people are trying to solve is really the problem that a lot of industrial OS vendors care about. Extensibility has its problems. For example, it makes the customer-support issues a lot more complicated, because you no longer know which OS each of your customers is running.

What are the chances for virtual environments to be deployed in industry?

You mean virtual machines? Some of those applications are extremely attractive.
Especially Mendel Rosenblum's work [at Stanford] with some of his students, taking a multiprocessor with some number of actual processors and turning it into a virtual multiprocessor, so that you can balance the load over a different number of virtual processors. That seems very interesting. The scalability issue, as people deal with larger and larger systems, is probably one of the key ones to solve, largely because systems nowadays mostly fail for software rather than hardware reasons. The more different eggs you have in a basket, the more problems you have when it fails, especially for people running multiple applications on their large-scale multiprocessors. They probably really want applications to be firewalled from each other. There is also the possibility of running different OSs on the same piece of hardware, which gets back to whether there will be one OS or multiple ones. Could you run Unix and NT on the same box at the same time? I'm not sure how interesting that really is. It might be important if some people try to emulate support for applications that only exist on another OS. We'll see how that plays out. The Microsoft model is that if you run another application, you buy another computer and another OS license, whereas the Unix model is that you try to run as many different applications at once as you can. We'll see which one of those models wins out, if any, in the industrial world.

What problems should OS researchers try to solve?

A lot of people have been looking over the years at simple performance issues, and often you see papers that have fairly impressive peak-performance results. There are two issues related to performance that might be more important to solve. One is the issue of predictable performance, especially from the standpoint of somebody who's actually selling computers as opposed to doing research.
Your customers want to know how much of a computer they need to buy to solve a particular kind of problem, and if their user base is growing by X percent a year, when they are going to run out of computer. Ideally, you would like to be able to predict that as far in advance as possible, and, if possible, not by actually going through the expense of applying the system to real users and benchmarking it. So there is a lot of work that could be done trying to predict system performance in advance, especially from the viewpoint of the OS community: we're the ones who can deal with latency issues. A bridge designer can tell you, based on centuries of civil engineering experience and nowadays fairly accurate CAD programs, that a certain number of trucks loaded with this many bricks can go over this bridge before it reaches its safety margins. We don't really have a way of predicting OS performance in advance, and this is important as we try to mature our ability to design computer systems.

The other thing is the problem of robust performance: worrying about worst-case performance rather than best-case performance. For example, if one of these disks in my RAID array goes offline, I might still be able to do I/O, but it might become an order of magnitude slower if I've done something wrong in the overall system design. So it's very important to look at ameliorating worst-case performance, as opposed to improving best-case performance. After all, that's really what gets people annoyed: when they're sitting in front of an airline reservation system and suddenly it's a hundred times slower than normal. That's much more of a problem than trying to increase the peak performance by 10%.

Jeffrey C.
Mogul is a researcher at the Compaq (formerly Digital) Western Research Laboratory, working on network and OS issues for high-performance computer systems, and on improving performance of the Internet and the Web. He received an SB from MIT, and an MS and PhD from Stanford. He is a member of the ACM, Sigma Xi, and CPSR. Contact him at Compaq Computer Corp., Western Research Lab., 250 Univ. Ave., Palo Alto, CA 94301; mogul@pa.dec.com; http://www.research.digital.com/wrl/people/mogul/bio.html.

John Wilkes

Can you comment generally on OS research today and in the future?

I would first like to make a distinction between computer science and computer engineering. I think it was Christopher Strachey who asked, "Is computer science?" Most OS research tends to be in the engineering space, which means that you cannot do it in isolation from the requirements of the clients or customers. In turn, this means that you cannot say whether a research idea is intrinsically a good thing: it is only a good thing if it helps solve a problem that somebody either has or will have. Unfortunately, not noticing this is reflected a little too often in a form of undesirable behavior that the OS research field exhibits: generating a single experimental answer and claiming it is "truth." Mark Weiser from Xerox has for many years encouraged the OS community to be more aggressive about repeatable experiments. We need to encourage this. In fact, anything we can do to make OS research more like a physical science would be a good thing.

You're saying that it is much more applied to computer science?

An OS's purpose is to provide a protected resource allocation scheme together with a set of services to its clients (the applications and middleware that it supports).
There's a tension between those two goals: the more services you provide, the more you tend to dictate how the resource allocation is done. So we see people moving back and forth between a really simplistic, low-level microkernel model and very sophisticated, high-function services; OS/360 comes to mind.

Most of the published work in the OS research field seems to have evolved quite naturally from looking at the problems that we as researchers see sitting in front of us. We have a noble heritage from small-scale, time-sharing systems such as Unix and NT, and these are the basis on which we have all done our work. We have pushed the use of such systems down onto desktops and made the workstation business possible; we have applied such systems to small-scale servers for file sharing and database engines.

We haven't done such a good job at the very high end, with the kinds of problems that show up in very large-scale commercial systems. These are problems of scale, manageability, performance, and stability, both in performance and in staying up in the face of failure. As a result, our work has failed to touch many of the people who should be more direct beneficiaries of it: the people who buy computer systems for commercial purposes, rather than as ends in themselves. This is an area where industry has invested huge amounts of effort, and where there are still enormous opportunities for doing a better research job.

At the low end, I don't think we've done a particularly good job of providing solid underpinnings for coping with small, resource-starved systems, having deferred a fair amount of the work to the real-time systems community. A lot of published OS research work here takes the form: "Here is a system that we are trying to help support; it has the following desired properties ...; these are the hacks we put in place to be able to cope with the restrictions that have been imposed on us. Aren't we clever?
" Not enough is done to address the principles behind the tradeoffs that are intrinsic to such systems. OS work is easy when resources are in ample supply. It is when they run out that you have to start being rather smarter than average.

However, in real-time and standard OSs, wasn't it preferable to push down more resources, either processor power or memory? That was the easiest way to resolve the problems, as opposed to solving complex problems?

That argument says that we should do nothing until everything we ever want to do has been made to fit into a cellphone. I don't think it is a very sensible way to do business. If you are in an engineering profession, which is what we are, there will always be cases where processor cycles, I/O bandwidth, battery lifetime, interconnect performance, cost, memory size, screen size, or something else is going to be the limiting factor.

What about the impact of contemporary storage solutions?

At the high end, there are issues of robustness in various flavors, as people put more and more data online and make their business rely on it to an ever greater extent. They have to be able to tolerate outages of more and more extreme kinds, including entire site disasters. The trend is towards ever more robust systems (you mustn't ever lose data), but there's a continuum of needs because the solutions have different costs. The kind of robustness that requires absolute continuous availability in the NASA spacecraft sense is not typically required in commercial systems. On the other hand, some companies now do essentially all of their business online, so they're terrified that they might lose the $10 million order that would pay for the computer hardware if the system goes down for just one ten-minute period.

This is an area ripe with opportunity, as networks start to make inroads into traditional storage system designs.
Similarly, tools to administer very large-scale systems are lacking. We don't do a great job of helping people design, configure, manage, and evolve these extremely large, very complicated systems, some of which are at the bleeding edge of what we are able to put together and keep running.

OS research today and in the future: what problems should OS researchers try to solve?

One important problem worthy of attention is better predictability, by which I mean the ability of a computer system purchaser and user to know how that system will behave, and whether it will meet their needs, across a range of events, usages, accidents, and failures. This will entail learning how to write and enforce service-level objectives across the system.

Another area is to do a better job of tying into the next-generation processor architectures. If you go back to our slogan of "protected resource allocation," the processor is one of the most important resources that we get to control, and we had better do an excellent job of it. This means ensuring that it is producing as much value to its users as possible, even across changes in processors, since OS-interface lifetimes are usually considerably greater than processor-architecture lifetimes. Occasionally, in providing fancy abstracted services, we lose track of making sure those resources are being allocated to applications doing useful work.

We need to pay more attention to greater agility in the implementation space. We made a stab many years ago at introducing microkernels for making different components of the system modularizable and replaceable. Frankly, I don't think it came out particularly well. The important software engineering message of microkernels got lost in the implementation that was chosen to prototype them.
It is probably an area worth revisiting.

The people operating in the extensible-systems space are making a valiant attempt at a different approach to this problem. I am not yet particularly enamored of their solution because it seems to be concentrating too much on the mechanics of taking one application and putting bits of it on either side of a protection boundary and not enough on how the semantics of the OS innards are used, and how multiple applications interact at more than a simple processor-cycle-time usage level.

Solutions here would allow us to make more rapid changes in the implementation bases that we know and love. For example, I've long been saying inside Hewlett-Packard that we should be trying very hard to make our OS aggressively support third-party suppliers of components in a way that preserves the reliability of the single-source system. Mechanisms and techniques to make this possible with high predictability should be the goal, rather than simply speeding up a few specialized applications.

So you consider all of these techniques, such as microkernels, and to a lesser extent extensible kernels and others, as a means to composability?

Yes. They give you an option of some late-bound flexibility after the original designers write their stuff. In some cases, unfortunately, the late binding is very late: it's at runtime, so on every single request you make a decision as to how the OS gets to do something. It seems likely that there is some intermediate point that does earlier binding, leaving the runtime stuff to the things that really need it, or for which it is irrelevant. The resource-limited soft real-time people have been forced into this space. They usually bend over backwards to provide a great deal more flexibility in earlier binding of configurations than the traditional time-sharing model.
This is, I think, one of the downsides in starting from a system whose genesis was general-purpose time-sharing. It does that job extremely well, but anytime you try to move away from that sweet spot somebody screams that you're stopping support for their particular application. And that's what you have to do to provide better support for some other subset.

Is it fair to aim at composability at the kernel level or to shift focus more to the middleware level? For example, Java, DCOM, and others provide some other means. How would you balance the amount of composability applied at each level?

This sounds like a trick question! You need to do some of each. Composability is like performance guarantees: you can't do it all at one level. If the components (such as the OS) are small enough that picking one of them gives you the flexibility you need to support your particular set of applications, you're done. But if it turns out that each such element represents many millions of lines of code, which is where we are now for our time-sharing systems, then it's probably too big a lump to swallow in one piece, and composition has to happen at multiple levels to achieve the desired effect. It would be good to have better ways to do this, especially if the outcome could be more predictable. Composability needs to happen all the way up and down the system. Automated design systems that helped people compose systems to do what they wanted in terms of manageability, configurability, memory footprint, performance, cost, and fault tolerance would be powerful tools, and well worthy of research efforts across the OS community. They would find immediate applicability in the field of embedded systems.

John Wilkes is the manager of the Storage Systems Program at Hewlett-Packard Labs.
His main research interest is in the design and management of fast, highly available, distributed-storage systems; he also dabbles in network architectures (the Hamlyn sender-based message model), OS design (most recently in the Brevix project), and learning about early Renaissance art and architecture. He earned a BA and MA in physics and a Diploma and PhD in computer science from the University of Cambridge. Contact him at Hewlett-Packard Labs, Mail Stop 1U-13, PO Box 10490, Palo Alto, CA 94303-0969; wilkes@hpl.hp.com; http://www.hpl.hp.com/personal/John_Wilkes/index.html.
