Daniel Kiper: Recovery of crashed Linux

This is a guest blog post by Daniel Kiper, one of our Google Summer of Code students. Daniel’s GSoC project is called Recovery of crashed Linux. Please welcome Daniel into the community.
My name is Daniel Kiper. I was born and live in Poland. I am a PhD student at Warsaw University of Technology, Faculty of Electronics and Information Technology. My research and thesis focus on Air Navigation Services (ANS) delivery, with an emphasis on Air Traffic Management (ATM) during contingencies. However, my core education and interests have been connected with IT and electronics since my engineering school days. What interests me most are operating systems (especially *NIX of any kind; over my professional career I have used Linux, AIX, Solaris and SCO Unix) and all types of virtualization. I most enjoy working on problems that sit on the thin line between hardware and software.
Sometimes people are surprised that my PhD thesis is focused on Air Traffic Management. They think this topic is far away from computer science. But this is not true. Today, ATM and, in a wider sense, ANS strongly depend on bleeding-edge technology, especially electronics and IT.
So far I have seen a few operational ATM centres, and I have always been amazed at how technology is pushed to its limits to provide critical services. There is no room for error. There is no excuse! Everything must be perfect.
Additionally, many ideas, methods and algorithms used in IT are very similar to the ones used in ATM. For example, traffic simulation is based on network flows and shortest path problems, which are also very often used in computer science.
What about my hobbies? I am passionate about my studies. Besides my studies, I am interested in science, technology and engineering issues of any kind (as I mentioned earlier, especially IT and electronics). Additionally, I am a railway fan. I am a member of the Polish Association of Railway Enthusiasts. Last but not least, I am interested in the 20th-century history of Poland and the world.
My GSoC 2011 Project
This is my second opportunity to take part in the Google Summer of Code. In 2010 my proposal, Migration from memory ballooning to memory hotplug in Xen, was chosen as a GSoC 2010 project. Jeremy Fitzhardinge from Citrix was the mentor for this project. Additionally, I was strongly supported by Konrad Rzeszutek Wilk. Currently, the project is very mature and it will be included in the mainline Linux Kernel shortly. Participating in GSoC 2010 allowed me to broaden my knowledge of the Xen hypervisor and the Linux Kernel. Moreover, I was able to learn something about open source project maintenance and management. I think this knowledge will help me participate in this year's GSoC more fruitfully and bring better results for the Xen and Linux Kernel community.
My GSoC 2011 project is Recovery of crashed Linux and is mentored by Konrad Rzeszutek Wilk. The work plan is outlined below. However, it may change a bit (maybe more than a bit) while the project is carried out, since some unexpected things may come up. It looks as follows:

  1. see how the HYPERVISOR_kexec_op hypercall works with PV kernel, baremetal kernel and Xen hypervisor
  2. integrate HYPERVISOR_kexec_op hypercall into mainline Linux Kernel; it is probably possible to reuse some code from Xen Linux Kernel Ver. 2.6.18
  3. create proper “transition” page table initialization
  4. ACPI/APIC/DMA/PCI/… initialization during reload of PV Linux Kernel as dom0; Xen hypervisor is not reloaded
  5. ACPI/APIC/DMA/PCI/… initialization during reload of PV Linux Kernel as dom0 and Xen hypervisor
  6. block device (PV) initialization (optional)
  7. network device (PV) initialization (optional)
  8. working kexec/kdump implementation on HVM domain with PV drivers (optional).

I am going to maintain kexec/kdump support for Xen after GSoC 2011 ends. Also, I am looking forward to any comments and questions.
Picture Credits: ipa, PANSA, Marcin Wilkowski
