I solved my problems with DR Spawner with a really crazy workaround

  • I solved my problems with DR Spawner with a really crazy workaround

    Hi everyone.

    After heavy testing I recently solved my problems with DR Spawner. Now I have a nearly 100% working solution.

    I had problems with DR all the time, ending up with crashed nodes and unfinished renderings where I had to restart all the nodes manually. That was a real pain in the ass, so I tried to figure out a reliable way. And I made it. But it's a somewhat crazy workaround, so please tell me what you think about it.

    So this is my solution:

    1. Register spawners as a service
    Register the spawners as a service and restart the machines. After that they are present all the time, with "vrayspawner90.exe" and max running in the background.
    (You have to restart the machines because the service is registered but not started. Alternatively you can start the service manually.)

    2. Give the spawner service access rights to network drives
    If necessary, modify the service in the computer administration (I do not know what this is called in English Windows, my OS is German) so that it starts under a certain user name with the appropriate password. You have to restart the service afterwards.
    (This was necessary for me because by default the spawner and the subsequent max session start under the user "SYSTEM". I store my files on a Linux NAS server, where there is no user "SYSTEM", so spawner and max have no access to this drive and do not find any maps or XRefs. Once I provide a user with a password for the "VRaySpawner 90" service, max renders fine. I think this might be a solution for many users who report that the render nodes do not seem to find the maps and XRefs. I have often heard the advice to store all maps and XRefs locally on each machine, but this may address the problem in many cases as well.)

    3. Batch-starting, -stopping and -restarting the nodes
    Provide *.bat-files to start, stop and restart all spawner-services on all nodes with psservice. See PSTools Website.
    (This way you are able to start, stop or restart all rendernodes at once. So if any node crashes simply restart them all. Many thanks to Clifton Santiago for the tip.)

    4. Very important: Everything in your max file has to be UNC!
    All maps, XRefs, *.ies files - everything that is linked in from outside (OLE) - must use UNC path names, i.e. \\server\share\... instead of a mapped drive letter.
    (If not, the nodes simply crash, do nothing and have to be restarted. This is not the case if the spawners - that is to say the file "vrayspawner90.exe" in the max root - are started manually and not via the service. Started manually, the nodes load files from mounted network drives properly.)
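
    For step 3, a minimal sketch of what such a *.bat file could look like, assuming PsService can be found on the PATH. The node names, user name and password are placeholders, and the file name "spawners.bat" is only an assumption. The action is passed as the first argument, so one file covers start, stop and restart.

    ```batch
    @echo off
    rem Usage: spawners.bat start     (or: stop, restart)
    rem Node names, user name and password are placeholders - adapt them.
    if "%~1"=="" (
        echo Usage: %~nx0 start, stop or restart
        exit /b 1
    )
    rem Send the requested action to the spawner service on every node.
    for %%N in (Node01 Node02 Node03 Node04 Node05) do (
        psservice \\%%N -u MyUsername -p MyPassword %1 "VRaySpawner 90"
    )
    ```

    Calling "spawners.bat restart" would then restart the service on every node in the list, one after the other.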

    ---------

    So far so good. Now everything is in principle prepared for DR rendering. For me it does work in principle - but...

    ...it's not stable. The nodes always crash after a certain number of renderings (5 to 15, it's never the same). It is much more unstable if Win XP Home machines are involved in the rendering task (that's the case for me). When one or more slaves crash you can now use your *.bat files to restart the nodes, but this is not very handy.

    Now the crazy stuff begins:

    I decided to automate rendernode-restart after every rendering.

    ----------

    5. Restart all nodes via maxscript after every rendering
    So I wrote a MAXScript that launches the restart .bat automatically after every render. The script checks whether VRay is the current renderer and whether "Distributed rendering" is checked (it makes no sense to restart the nodes if you are not rendering with them). It has to be placed in the startup folder for scripts (restart max afterwards). The result is that when you render with DR all the slaves start rendering, and when the rendering is finished all slaves are restarted automatically, so that they are clean and fresh for the next rendering.

    Advantage: DR rendering succeeds for me nearly 100% of the time. Sometimes a node quits rendering immediately, but in most cases this does not seem to disturb the other nodes. So the rendering is finished anyway, maybe with one node missing. After finishing, all nodes are restarted and fresh again for the next rendering.

    Disadvantage: Restarting the nodes takes some time. In my case I use 5 nodes and it takes about 30 seconds to restart them, so I have to wait 30 seconds before I can start the next rendering. If you start too early, maybe none of your nodes is ready and only the workstation renders. Maybe only some of them are ready; I think it can confuse DR when another node signs up while the job is being processed, but I think the ready ones take part and the others do not. In most cases I think this is no real disadvantage, because after finishing a rendering you go back to working on your scene, and not many tweaks are done in 30 seconds.

    ------------

    So in sum I am very, very happy with this solution, because now I can use DR - without this kind of reliability it wasn't usable at all.

    If you want to try this workaround: This is the "restart VRay-spawners.bat"-file I work with:
    Code:
    psservice \\Node01 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    psservice \\Node02 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    psservice \\Node03 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    psservice \\Node04 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    psservice \\Node05 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    psservice \\Node06 -u MyUsername -p MyPassword restart "VRaySpawner 90"
    You can download the needed tools and usage information here: PSTools Website.
    Tip: psservice.exe needs to be placed in the Windows "System32" folder.

    ...and this is the maxscript I use in the startup-folder:
    Code:
    /*
    damRestartVRaySpawners
    (c) Sascha Selent, 2008
    mailbox@deepartmend.de
    
    Launches .bat-file to restart VRay Spawners via psservice
    */
    
    fn damRestartVRaySpawners =
    (
    	ShellLaunch "C:\\restart VRay-spawners.bat" ""
    )
    
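    -- Remove any earlier registration so the callback is not added twice
    callbacks.removeScripts #postRender id:#damRestartVRaySpawners
    -- After each render: restart the spawners only if VRay is the current renderer and DR is enabled
    callbacks.addScript #postRender "if (classof renderers.current == V_Ray_Adv_1_50_SP2) and (renderers.current.system_distributedRender == true) do damRestartVRaySpawners()" id:#damRestartVRaySpawners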
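    You have to adapt both pieces of code to your needs: node names, user name and password in the .bat file, and the path to the .bat file and the renderer class name (here V_Ray_Adv_1_50_SP2) in the script.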

    ----------

    So maybe this can help some people get things working, and maybe it can give the people at ChaosGroup some ideas for doing things better, but...

    ...I think DR is completely out once RealtimeRendering is in, eh? But who knows? Maybe some of these issues apply to RealtimeRendering as well?

    Best regards.

    Sascha

    EDIT: And just to mention the benefit of DR: My little cute testscene, rendered with DR = 4min1sec, without DR = 11min6sec
    Last edited by Sascha Selent; 05-11-2008, 09:57 AM.

  • #2
    This is just how I have been doing it for a while, except for the last MAXScript... should work great!

    • #3
      No one's interested?

      Originally posted by aphex79
      This is just how I have been doing it for a while, except for the last MAXScript... should work great!
      Hi aphex79
      Works great!

      Hihi, seems that no one else is interested in this solution?!? And I thought I had found the holy grail of DR rendering...

      ...anyway, for me it is the holy grail. Solved all my problems with DR-rendering...

      • #4
        Sascha,

        thank you for the post,
        I have the same problem and I am sure that PSTools would solve it. I used this same method in the past on an older version of VRay.
        The problem is I can't get PSTools to work anymore. Do you know if it works on Win64?

        • #5
          Originally posted by jean-piere
          The problem is I can't get PSTools to work anymore. Do you know if it works on Win64?
          Uh, sorry man, I'm still on 32-bit... Can't help here...

          • #6
            they do work on 64bit
            Dave Buchhofer. // Vsaiwrk

            • #7
              Hi Sascha -
              I'm trying to follow your method here - thanks for sharing it!

              For the maxscript to automate restarting: do you need that installed on each of the nodes? And do you need to have pstools on each node as well?

              So far I have managed to get the pstools restart to work from my workstation, but that is as far as I have gotten.

              Thanks in advance for any help.
              b
              Brett Simms

              www.heavyartillery.com
              e: brett@heavyartillery.com

              • #8
                Ps Tools Work on XP64bit.

                There is no need to restart the spawner, except in special circumstances.
                Do a pskill on the 3dsmax.exe process on the slaves instead. It is a lot faster, as the spawner starts a new max almost immediately.
                That's how I have been doing it at least.
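
                A sketch of what that could look like as a *.bat file, assuming PsKill from the same PSTools package can be found on the PATH. Node names, user name and password are placeholders, and whether the process name has to be given with or without ".exe" is an assumption worth checking on one node first.

                ```batch
                @echo off
                rem Kill the 3ds Max process on every slave. The spawner service
                rem keeps running and launches a fresh max on its own.
                rem Node names and credentials are placeholders - adapt them.
                for %%N in (Node01 Node02 Node03 Node04 Node05) do (
                    pskill \\%%N -u MyUsername -p MyPassword 3dsmax.exe
                )
                ```
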
                Signing out,
                Christian

                • #9
                  Thanks Christian - I will try that method. Currently having another issue that may drive me back to running it manually if I can't figure it out though

                  b
                  Brett Simms

                  www.heavyartillery.com
                  e: brett@heavyartillery.com

                  • #10
                    Originally posted by simmsimaging
                    Hi Sascha -
                    For the maxscript to automate restarting: do you need that installed on each of the nodes? And do you need to have pstools on each node as well?
                    No, just on the master machine...

                    • #11
                      Too hasty...

                      I think I was a little too hasty...

                      I still have dropouts, and it seems that the heavier the render job is and the more nodes are involved, the more of them drop out. Simple, quick tasks render fine with almost no drops, but heavy jobs in high resolution fail very often.

                      It seems to happen at the transition from one pass to the next.

                      E.g. when rendering with IRM for the primaries and LC for the secondaries, all nodes render the LC fine, but when they try to start the IRM pass some fail, or when they try to start the final pass some fail...

                      Seems that there is still work to do...

                      ...what can it be? What can be done? I'm in despair...

                      • #12
                        I did not spend the time you did, nor do I have the expertise, but after quite a bit of effort over the last week or so I gave up on having a foolproof system.

                        My machines have all been switched back to manually started DR spawners. The service versions caused too many problems, so I just have to keep an eye out for nodes dropping out. At least that way the worst case is that some nodes do not contribute, whereas with the spawner as a service I often got black buckets and ruined renders.

                        I am going to try to find some way to automatically quit and restart the DR spawner (vrayspawner90.exe) as a program, and then maybe find some way to have that happen automatically when the machine sits idle for more than a few minutes. I have found that quitting the spawner and restarting it manually does not seem to cause major problems with a DR render, at least not so far, so the risk seems minimal.
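
                        A rough sketch of the quit-and-restart part as a *.bat file that runs locally on a node. It is untested and rests on assumptions: the install path is a placeholder, and taskkill comes with XP Professional but, as far as I know, not with XP Home. It also kills any max running on that machine, so it only fits pure render nodes.

                        ```batch
                        @echo off
                        rem Runs locally on a render node. The path below is a placeholder - adapt it.
                        rem Stop the spawner first so it cannot launch a new max in between.
                        taskkill /F /IM vrayspawner90.exe
                        taskkill /F /IM 3dsmax.exe
                        rem The empty "" is the window title that start expects before a quoted path.
                        start "" "C:\3dsmax9\vrayspawner90.exe"
                        ```

                        The idle trigger is not covered here; the file would still have to be launched by hand or by some scheduler.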

                        b
                        Brett Simms

                        www.heavyartillery.com
                        e: brett@heavyartillery.com