I have used DFSR for some time now and have had only great experiences with it, but always with low volumes of data. This time I was implementing it as a high-availability solution for a web farm. We had about 40 GB of data with a massive number of files to replicate, although in the scheme of things 40 GB of data is really not that much.
Anyway, cutting to the point: the articles I read said the best way to get a quick initial replication was to copy the files to the destination server first, so I did this via robocopy, keeping all attributes and permissions intact.
However, contrary to popular belief, this was what ultimately caused so much grief. Once replication started, my event log began filling up with:
Event Type: Information
Event Source: DFSR
Event Category: None
Event ID: 4412
Date: <Date>
Time: <Time>
User: N/A
Computer: <Computer name>
Description:
The DFS Replication service detected that a file was changed on multiple servers. A conflict resolution algorithm was used to determine the winning file. The losing file was moved to the Conflict and Deleted folder.
Additional Information:
Original File Path: <File path>
New Name in Conflict Folder: <Folder name>
Replicated Folder Root: <Folder name>
File ID: <File ID>
Replicated Folder Name: <Folder name>
Replicated Folder ID: <Folder ID>
Replication Group Name: <Replication group name>
Replication Group ID: <Replication group ID>
Member ID: <Member ID>
See more about this here: http://support.microsoft.com/kb/944804
After further investigation, this was because the file ID of the files on the destination server differed from those on the source server. Thanks, robocopy.
This was filling up my ConflictAndDeleted folder very quickly with a lot of what I thought was unnecessary crap. Nevertheless, I let it run for a few days and came back to find the event log entry below:
Source : DFSR
Category : None
Event ID : 2104
Type: Error
Description :
The DFS Replication service failed to recover from an internal database error on volume F:. Replication has been stopped for all replicated folders on this volume.
Additional Information: Error: 9203 (The database is corrupt (-1018)) Volume: DB587759-DC0B-11DC-940D-00304888DB13 Database: F:\System Volume Information\DFSR
Brilliant, I had a corrupt database.
Possible Solutions
Taken from Google Groups [1]
I recently had a spat with the "new and improved" DFSR and wanted to let everybody in on the procedure for resetting a DFSR member.
First off, removing everything using the GUI doesn’t help when the database is corrupt. DFSR keeps the database regardless of its membership status. So if for example you had a broken DFSR server and removed it from every replication group, when you added it back you’d still be out of luck.
To clear it completely after the server is no longer a member of *any* DFSR replication group (i.e. remove it from all of them in the GUI and wait for AD replication to propagate the changes):
1. Stop the "DFS Replication" service.
2. On the drive(s) in question, grant yourself full permission to the hidden system "System Volume Information" folder.
3. Navigate into the folder and delete (or move to be extra careful) the DFSR folder.
4. Navigate to each replication group the server was a member of and delete (or move to be extra careful) each hidden system "DfsrPrivate" folder.
5. Start the "DFS Replication" service.
You may now treat the server as a brand new member for the replication groups. Now all you need to deal with is DFSR’s sloppy initial replication routines (hint: those missing files are in the "DfsrPrivate\PreExisting" folder).
http://www.eventidwiki.com/index.php?title=Event_ID_:_2104%2C_DFSR
However, this did not work for me: the folder would not rename under Server 2008, even with UAC off. This did work for me, though:
1. Click Start, right-click Command Prompt and click Run as administrator to open an elevated command prompt window. Then go to driveletter:\System Volume Information\dfsr at the prompt and type the command below to rename it:
ren "old folder name" "new folder name"
I did this on both servers participating in the replication. Further to this, I deleted the folder I was replicating on the destination server and let DFSR do all of the creating.
After 2 days I had a fully functional DFSR, working the way it should!
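The reset described above can be sketched as a small Python helper that only prints the commands to run from an elevated prompt; it does not execute anything. It combines the service stop and start from the quoted steps with the ren workaround. The function name plan_dfsr_reset, the backup name DFSR.old and the short service name DFSR are my assumptions for illustration, so check them against your own servers first.

```python
from pathlib import PureWindowsPath

# Assumption: "DFSR" is the short service name of "DFS Replication".
SERVICE_NAME = "DFSR"


def plan_dfsr_reset(drive, backup_name="DFSR.old"):
    """Return, in order, the commands for resetting the DFSR database on one volume.

    drive is a drive letter such as "F"; backup_name is a hypothetical
    name for the renamed database folder.
    """
    database = PureWindowsPath(f"{drive}:\\") / "System Volume Information" / "DFSR"
    return [
        f"net stop {SERVICE_NAME}",              # stop replication first
        f'ren "{database}" "{backup_name}"',     # move the corrupt database aside
        f"net start {SERVICE_NAME}",             # service rebuilds the database
    ]


if __name__ == "__main__":
    for command in plan_dfsr_reset("F"):
        print(command)
```

Printing the plan rather than running it keeps the sketch harmless; renaming instead of deleting leaves the old database around in case you want it back.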
Some commands I found useful through the process:
dfsrdiag backlog /rgname:"cluster replication" /rfname:websites /rmem:RECEIVINGSERVER /smem:SENDINGSERVER >c:\backlog1.txt
You might also find the %SystemRoot%\debug folder useful (typically C:\Windows\debug), as the DFSR debug logs are written there.
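If you want to capture the backlog repeatedly while replication catches up, the same command can be wrapped in a short Python sketch. The function names build_backlog_command and save_backlog are hypothetical; only the dfsrdiag switches shown in the command above are used, and the output is saved as-is without trying to parse it.

```python
import subprocess


def build_backlog_command(group, folder, receiving, sending):
    """Build the dfsrdiag backlog argument list for one replicated folder."""
    return [
        "dfsrdiag", "backlog",
        f"/rgname:{group}",      # replication group name
        f"/rfname:{folder}",     # replicated folder name
        f"/rmem:{receiving}",    # receiving member
        f"/smem:{sending}",      # sending member
    ]


def save_backlog(group, folder, receiving, sending, out_path):
    """Run dfsrdiag and write its raw output to out_path; return the exit code."""
    command = build_backlog_command(group, folder, receiving, sending)
    result = subprocess.run(command, capture_output=True, text=True)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return result.returncode


# Example call, run on a server that has dfsrdiag installed:
# save_backlog("cluster replication", "websites",
#              "RECEIVINGSERVER", "SENDINGSERVER", r"c:\backlog1.txt")
```

Passing the arguments as a list lets Python handle the quoting of the group name with a space in it, which is where the hand-typed command is easiest to get wrong.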
Good Luck.