From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Apr 01 2008 - 09:02:41 PST
I believe that when you say "there is an entry for that file in the proc
filesystem", you mean that "ls -l /proc/<pid>/fd" shows something for
the file. If you mean something else, let me know.
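
In case it helps to see what such an entry looks like, here is a small
sketch. It assumes a Linux /proc and a POSIX shell; the scratch file and
descriptor number 9 are arbitrary choices for this example and have
nothing to do with BLCR. It keeps a file open, removes its name, and then
looks at the descriptor through /proc:

```shell
#!/bin/sh
# Sketch: an open file whose name was removed still appears in /proc/<pid>/fd.
tmp=$(mktemp)                 # scratch file standing in for "outfile"
exec 9>"$tmp"                 # keep it open on descriptor 9 of this shell
rm -f "$tmp"                  # remove the name; the open file itself survives
ls -l "/proc/$$/fd"           # descriptor 9 is still listed
target=$(readlink "/proc/$$/fd/9")
echo "$target"                # on Linux the old path is shown with " (deleted)" appended
exec 9>&-                     # close the descriptor
```

If the link target shown for your restarted process ends in "(deleted)",
the file has no name on disk. If it shows a plain path, the file exists
under exactly that name.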
As to what is going wrong after you remove the vfs_unlink() from the
cr_mkunlinked() logic, I have only one guess based on your description.
My guess is that the file is getting created with a name like
".blcr_0123.456789ab". Because this name starts with '.', the "ls"
command won't show it by default; try "ls -a". If this *is* what is
happening, then the issue is that the cr_mkunlinked() code is renaming
the file before creating it. The rename is done only when the last
argument ("unlinked_id") is non-zero.
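
To illustrate the point about names that start with '.', here is a
minimal sketch. The directory is a scratch directory made by mktemp, and
the file name is simply the example name used above, created empty here
for demonstration:

```shell
#!/bin/sh
# Sketch: a file whose name starts with '.' is hidden from plain "ls".
dir=$(mktemp -d)
: > "$dir/.blcr_0123.456789ab"    # illustrative name only, created by hand
plain=$(ls "$dir")                # default listing skips dot-names
all=$(ls -a "$dir")               # -a includes them (plus "." and "..")
echo "ls:    [$plain]"
echo "ls -a: [$all]"
rm -rf "$dir"
```

Running "ls -a" in the directory where you expected "outfile" should
show whether a file of this kind was left behind.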
Both the rename and the unlink are triggered by the non-zero
"unlinked_id" that cr_mkunlinked() passes to cr_filp_mknod(). It should
be sufficient to call cr_mkunlinked() with a zero value for the
unlinked_id argument, though you may need to remove the debugging check
for a non-zero value at lines 933-936 of cr_io.c (assuming BLCR version
0.6.2 or newer).
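
The control flow I am describing can be pictured with a userspace
analogy. This is not BLCR code: the function make_file, its two
arguments, and the ".demo_" prefix are all invented for this sketch, and
it shows only the idea that one id value gates both the hidden name and
the unlink:

```shell
#!/bin/sh
# Userspace analogy only; make_file and the ".demo_" prefix are hypothetical.
make_file() {
    path=$1
    id=$2
    if [ "$id" -ne 0 ]; then
        # Non-zero id: the requested name is replaced by a hidden one.
        path="$(dirname "$path")/.demo_$id"
    fi
    : > "$path"                   # create the (possibly renamed) file
    if [ "$id" -ne 0 ]; then
        rm -f "$path"             # non-zero id also removes the name again
    fi
}

dir=$(mktemp -d)
make_file "$dir/outfile" 0        # zero id: ordinary file under the given name
make_file "$dir/other" 7          # non-zero id: nothing remains on disk
remaining=$(ls -A "$dir")         # -A lists dot-names but not "." and ".."
echo "$remaining"
rm -rf "$dir"
```

In this analogy, dropping only the "rm" step would still leave the file
under the hidden name, which matches my guess about what you are seeing.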
Let me know if you need anything else.
-Paul
m.kumar@iitg.ernet.in wrote:
> Dear sir,
>
> While trying to implement BACKUP_RESTORE policy in file checkpointing, we
> came across a problem. Specifically, while restarting the process from
> its context file, the file opened by the process ( outfile : opened by
> Examples/file_counting/file_counting ) does not get created on the disk
> filesystem.
>
>
> To implement the BACKUP_RESTORE policy, we have used your function
> cr_mkunlinked() (in file cr_io.c ) logic, with the modification that we
> are not doing vfs_unlink() in our version of the function. We think that
> doing this should create a normal file that is not unlinked ( since we
> are not performing the vfs_unlink() in our version ), even if we delete
> the file ( outfile ) by rm command after we have taken the checkpoint.
>
>
> However, this does not happen. NO file gets created on the disk
> filesystem, but there is an entry for that file in the proc filesystem,
> which was getting updated after we ran cr_restart. What we wanted was for
> this file to be created on the disk (if removed) and to be updated just
> as it is being updated right now in the proc filesystem.
>
>
> Could you please tell us what is wrong with our modification ( removal of
> vfs_unlink() from your cr_mkunlinked() function ) ?
>
>
> Thanks,
>
> Manish Kumar & Abhinav Jha
>
> IIT Guwahati - India
>
>
--
Paul H. Hargrove PHHargrove_at_lbl_dot_gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900