How to recover from a BOE 4.0 patch gone bad


Recently I had the unfortunate experience of a patch upgrade going bad. The upgrade itself was not to blame; the culprit was the uninstallation of the 13 patches I had applied starting with BOE 4.0 SP1. This system had been running for over 1.5 years, and each patch was applied to resolve a known issue during beta and ramp-up. I was uninstalling the old patches to remove the 55 GB of uninstall data cache that had accumulated over those 13 patch installs, a recommendation that came directly from SAP support. However, there was apparently an issue with one of the patch uninstall scripts that resulted in the removal of several key binaries from the install folders and from the SQLite database that tracks the binaries. All attempts to reinstall the patches and software were futile at this point. The system was unrecoverable, and the only way to resolve the problem while freeing up the 55 GB of space was to perform a clean install of both the operating system and the BOE 4.0 software. While this might sound like I just prescribed a computer lobotomy, rest assured that the process is not only straightforward but safe when applied correctly. The following information outlines the steps I followed to recover my system when only a rebuild will suffice.

    1. The first and most important step is to back up the information stores and configuration files of BOE 4.0. Three main information stores work together to form the BOE repositories. The first store is the CMS system database. This is the database that the CMS service manages, and it contains all the metadata for the reports in the system. Ask your DBA to create a backup of the database before proceeding, “just in case”. The second and third information stores are the Input File Repository and the Output File Repository. They can be found under:
      • \SAP BusinessObjects Enterprise XI 4.0\FileStore\Input
      • \SAP BusinessObjects Enterprise XI 4.0\FileStore\Output

In larger clustered environments these directories might already be placed on a network share, so it is vital that you know and understand your configuration before proceeding. If the files are stored in their default location on the server you are about to reformat, be sure to copy these directories to a safe location. If they already exist on a network share, don’t worry, as they will not be disturbed by the complete rebuild of the server. The final items to copy to a safe place include any custom .properties files you configured for Tomcat. There might be other Tomcat configuration that you need to back up as well. Please consult your installation and configuration documentation before deleting any Tomcat files. If all else fails, ask the server admins to create a full system backup that can be restored incrementally, and to verify that a full system backup from before the issues occurred exists.
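If the FRS folders live on the server itself, the copy described above can be scripted. What follows is only a minimal sketch in Python, not an SAP tool: the function names `backup_frs` and `verify_copy` are my own, and the paths you pass in are whatever applies to your install. It copies the Input and Output folders and then compares every file byte for byte, so you know the backup is complete before anyone reformats the disk.

```python
import filecmp
import shutil
from pathlib import Path


def backup_frs(filestore: Path, backup_root: Path) -> list:
    """Copy the Input and Output FRS folders into backup_root; return the copies."""
    copies = []
    for name in ("Input", "Output"):
        source = filestore / name
        target = backup_root / name
        # copytree refuses to write into an existing target, which keeps an
        # earlier backup from being overwritten by accident.
        shutil.copytree(source, target)
        copies.append(target)
    return copies


def verify_copy(source: Path, target: Path) -> bool:
    """Return True when both trees hold the same files with identical bytes."""
    src_files = sorted(p.relative_to(source) for p in source.rglob("*") if p.is_file())
    dst_files = sorted(p.relative_to(target) for p in target.rglob("*") if p.is_file())
    if src_files != dst_files:
        return False
    # shallow=False forces a content comparison instead of trusting size/mtime.
    return all(
        filecmp.cmp(source / rel, target / rel, shallow=False) for rel in src_files
    )
```

You would call `backup_frs` with the FileStore folder of your install and a destination on a disk that survives the rebuild, then run `verify_copy` on each pair. The same `verify_copy` check is useful again later, when the folders are copied back to the rebuilt server.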

The remaining steps will guide you through the rest of the process.

      1. Make note of some key information in the CMC and server OS. These items are key to the setup of the new BOE 4.0 instance on the same server.
        1. The exact name of the SIA node associated with the server. In most cases this is the host name of the server but BOE 4.0 asks you to enter your own name now, so it might not be the same as the host. You can find this under Server -> Nodes in the CMC. In my case the node was named “DFTBOE40” (See Image: Link )
        2. The password associated with the Administrator account.
        3. The cluster key password.
        4. The boot drive letter and other drive letters. (C:\ and D:\ for example)
        5. The installation directory of BOE 4.0 (Full Path)
        6. All ODBC DSNs, Oracle client configurations, etc.
      2. Reformat all the partitions and reinstall the operating system. Most likely your server administrator will help you with this, but there are a few key items to note. Make sure the new server has the same host name and the same drive letters as the previous installation.
      3. Install all database clients and reconfigure the OS-level connection details (ODBC, tnsnames.ora, etc.).
      4. Once the OS is fully restored it is time to start the BOE 4.0 installation. Make sure to install the software to the same directory and drive. The servers in the CMS repository can contain path references to the old server install path. Because we are recovering the server from its previous state we want to keep the same install path to ensure seamless recovery.
      5. When asked for the new SIA node name, use a temporary name; do not use the original node name you wrote down in step 1. This is important to the recovery because we need a temporary SIA and CMS.
      6. When asked for the CMS database and audit database, use the existing databases. Do not create a new database. Also make sure that you do not “reset the database”, or you will lose all the configuration information and reports in the database repository.
      7. At this point you can follow your normal installation steps.
      8. Once you have your system up and running using the temporary SIA and CMS, you should log on and make sure all your users, groups, reports and universes show up in the CMC. You can now copy your Input and Output FRS directories back to the default location. If you are in a clustered environment using a UNC path for the FRS directories you can skip this step, but note that you will not be able to view any reports until the FRS is restored.
      9. Using the CCM (Central Configuration Manager), stop the SIA and change the CMS port and SIA port of your temporary node. The defaults should be CMS port 6400 and SIA port 6410. I changed mine to 7400 and 7410. With the ports reconfigured, start the temporary SIA.
        1. This part gets a little tricky and the process is not clearly documented in the BOE 4.0 administrator’s guide. Pay close attention…
      10. Add a new node to your environment using the CCM. (See Image: Link ) When configuring the node specify your SIA node name from the previous environment. The port should be 6410 and the option should be set to “Recreate Node” (See Image: Link )
      11. Assuming that the recreation of the Node completes and the SIA starts with your original CMS service and other services you are ready to remove the temporary SIA and CMS. You can do this by stopping the temp SIA and then clicking the “X” delete button on the CCM.
      12. You can now log in to the CMC and BI Launchpad to test all your old content. If you are experiencing issues that cannot be identified, you can call support, employ a consultant or restore the last known good backup.
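Several of the steps above depend on values matching (host name, drive letters, install path) or deliberately not matching (temporary node name, temporary ports). A minimal sketch of that checklist in Python is shown here, purely as an illustration. It does not talk to BOE in any way, and `ServerRecord`, `check_temporary_install` and `ready_to_recreate_node` are hypothetical names; you fill in the values by hand from your notes and from the rebuilt server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerRecord:
    """Values written down in step 1 (hypothetical structure, not a BOE object)."""
    host_name: str
    install_dir: str
    drive_letters: frozenset
    sia_node_name: str
    cms_port: int
    sia_port: int


def check_temporary_install(old: ServerRecord, temp: ServerRecord) -> list:
    """Return the problems that would break the recovery; empty means none found."""
    problems = []
    # Windows host names and paths are case-insensitive, so compare lowercased.
    if temp.host_name.lower() != old.host_name.lower():
        problems.append("host name differs from the original server")
    if temp.install_dir.rstrip("\\").lower() != old.install_dir.rstrip("\\").lower():
        problems.append("install directory differs from the original")
    if temp.drive_letters != old.drive_letters:
        problems.append("drive letters differ from the original")
    # The temporary node must NOT reuse the original name; a case-only
    # difference is treated as a clash to stay on the safe side.
    if temp.sia_node_name.lower() == old.sia_node_name.lower():
        problems.append("temporary SIA node reuses the original node name")
    return problems


def ready_to_recreate_node(old: ServerRecord, temp: ServerRecord) -> bool:
    """True once the temporary node has been moved off the original ports."""
    return temp.cms_port != old.cms_port and temp.sia_port != old.sia_port
```

In my case the original record would carry the node name “DFTBOE40” with ports 6400 and 6410, and the temporary record would carry a throwaway node name with ports 7400 and 7410. An empty list from `check_temporary_install` plus `True` from `ready_to_recreate_node` means the original node name and port 6410 are free for the “Recreate Node” step.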

28 comments

  1. Ouch. I don’t envy you. However, I am approaching a situation that has some resemblance. I will be upgrading my client’s BI deployment from BO XI 3.1 to BI 4.0. There are 2 clustered, physical machines (beefy suckers) that are presently running XI 3.1 SP3 FP 3.1. I’ll be doing the test report conversion on virtual machines which will be disposed of when the upgrade is complete.

    However, the approach I would like to take with upgrading the production system is to shut down the SIA on 1 of the two application servers (and the corresponding web and database servers). I will then *uninstall XI 3.1* from that box and install 4.0. Then, migrate/convert the reports from the XI 3.1 node to the 4.0 node. Then uninstall XI 3.1 from the other app server, install 4.0, and cluster the two.

    Do you think I would run into similar issues with critical binaries being removed because I am uninstalling a 3.1 instance? I wouldn’t think so since 4.0 is a complete installation. However, it sounded as though you had OS level issues. Do you suppose those would similarly impact my proposed upgrade process?

  2. In general terms you should have no issue because 3.1 and 4.0 use different installers, directories, and paths. However I would assume a level of risk with the approach as there might be conflicts with .dll and binaries even after running the uninstall. If it were my choice I would uninstall BOE from one node (to remove the servers from the cluster) then reformat and re-install the OS. It is a cleaner approach that might save you time in the long run.

    Are you planning to use the Upgrade Management tool to migrate or use a copy of the CMS DB / FRS directories?

  3. I guess I was afraid that you would recommend the OS install. It’s not a problem. Naturally, it would just entail an extra step, a few more approvals, and collaborating with some system people.

    I intend to use the upgrade tool. Have you had better experience using a different method?

  4. In 4.0 you really only have the option to use the Upgrade Management Tool (UMT). However, there are a few problems or restrictions with the tool in 4.0 (not FP3) concerning large repositories. The UMT is not as friendly when performing incremental upgrades to reduce the number of objects migrated in a single batch. To work around these restrictions I have been creating .biar files using the 3.1 Import Wizard (containing only the objects I want to migrate, regardless of dependencies), then using the UMT to import the .biar file. One example is importing just LDAP or WinAD groups from 3.1 to create them in the target with the same CUID. The UMT will automatically select all dependencies of the Group (User, User’s Inbox, User’s Favorites, etc.) even if you only want to migrate the Groups to prep the AD / LDAP setup. There are other instances where the .biar file method works better as well. Again, this all depends on the amount of content you are moving.

  5. Yes, these are good points. Thank you for your detailed explanation. You have a fantastic blog.

  6. Thanks Jonathan, your blog is a huge help (just subscribed today). I have a somewhat unrelated question. You mention, “If you are in a clustered environment using a UNC path for the FRS ” – I am running a clustered environment, but a tech from SAP had told me that having the FRS on a share could cause file corruption because of the way BO does block level IO (or something like that). He said if it was a certain type of share (DFS maybe) then it would be OK. I would really like to have the FRS running on two nodes; is this safe? Thanks a lot.

  7. Robert,

    In order to cluster the Input or Output FRS, you need to configure them to use a UNC path (\\server\share\) via SMB or a shared mount via NFS (Unix). The Input FRS will have a different path from the Output FRS, but all of the Input FRS in a cluster should have the same UNC path (and all Output FRS should share one path as well). There is no need to worry about block level locking because only one Input or Output FRS runs as active in any given BOBJ cluster, and the SMB or NFS protocols don’t use block level locking. Typically I use an existing independent NAS or file server to host these files (i.e. something other than the BOBJ servers). I’m not sure who told you that at SAP, but they should probably go back for additional training.

    I have never used DFS but I could only imagine that could result in orphaned objects if the replication fails. In addition, it is not necessary given that both SMB and NFS are the recommended and time tested solutions.

    Thanks,

  8. Hiya Jonathan,

    I’d like to ask what about patches in this scenario of recovery?
    Let’s assume you have an SP2 full install, but the other node of cluster is already patched to SP5 P5… how would the process change in this case to apply patches during node recovery?

    Thanks,

  9. Krisztian,

    My approach is to perform the installation on the node using a temporary CMS and FRS. After I have fully patched the node, I delete the SIA associated with that temporary install. I then recreate the SIA within the existing cluster. I don’t know that this is a technical requirement, but it is the safest approach. I have had issues in the past when mixed CMS.exe versions were running in a cluster. Even though they are only temporarily out of sync, there is the potential for CMS DB corruption (or misconfiguration).

    Thanks,

  10. Thanks a lot. By the way, one more question you might have experience with: if I’d like to add FP3 features to the platform while recovering (they weren’t added previously), how would that affect the CMS and Audit DB? We are still talking about restoring one node of a two-node cluster while the other node is currently working. Are there any consequences I might face? Would it impact the other node in any way?

  11. Sorry, also just a track-back comment on previous:

    If the CMS DB is running with the second prod node and you’re going to install the first node from scratch connecting to the same live CMS DB, in my case there will at first be a different cms.exe, since SP4 FP3 would come into operation after install while we have the SP5 Patch5 secondary node up and running. What exactly do you mean by “temporary CMS”? FRS is not an issue anywhere, since it’s UNC…

    Same goes for each Patch install still reaching final stage… Or am I missing something here? Can a CMS DB provide service for 2 nodes with different SP/Patch level? Clustering should be done as very last step, when both nodes are on same level, shouldn’t it?

  12. My key concern is using BOE 4.0 SP2 full installer on a cluster that is already running BOE 4.0 SP5. I would suggest that you use the SP4 Full Installer to avoid any potential issues. In my case I avoid all risk by using a temporary CMS for the install of the new node. Once I am on the correct versions, I then recreate the new node SIA in the cluster.

  13. Sure, I will, wanted to go with that (SP4 FP3), then SP5. So there’s no tricks at all with CMC and audit databases, right? Like copying and restore at other server/schema… am I correct?

  14. Hi Jonathan,

    I am trying to recover a BusinessObjects server by installing it all over again on a virtual machine. I am not sure what to do with a certain step. You said that I must choose to use an existing database. Should I install the SQL server database on the virtual machine before I start the installation, and somehow copy the CMS DB datafiles, or should I give the path to the existing database on the server with the ‘broken’ BusinessObjects server?

  15. Assuming that your DB was on another server, you can reinstall using the default DB. Then use the CCM to re-point the CMS. You have to also have a copy of the old input/output FRS folders to make this work.

  16. We are trying to do so right now. Still we have a question: why create a temporary node with a different name, and not create directly the same node, since it is a new installation, on a new machine?
    Thanks,
    Andreea

  17. The article assumes that you lost your BOBJ environment and need to restore it while retaining the original content. If that’s not what you’re doing, then this article does not apply to your situation.

  18. Our customer uninstalled about 7 patches; after that the server node would not recognize the user name and password and returned an error that the server is down. Before uninstalling, they copied the C:\Program Files (x86)\SAP BusinessObjects\ folder to another partition. This folder and the CMS database + audit is all we had. Hopefully your solution works.

  19. In that case, the reason you use a temporary CMS, SIA or CMS database is to protect the original CMS DB server records while you patch the server back to the same state. You can use the original CMS database with a temp SIA and CMS, but you have to be very careful not to corrupt that DB. I like using a separate DB during the rebuild process, but you can use the original if that is all you have. Once you get the binaries back to the correct version, use the CCM to point your new CMS to the old DB. If you used the original DB, simply follow the instructions in the article. You then copy your FRS files back over. You have to recreate your SIA node after re-pointing to update the server records. The last two times I did this it worked without issues, although you have to be very careful not to miss a step. This is only one way to do it; SAP support might give you another.

  20. Hi Jonathan

    Thanks for your blogs!!

    Need your views on our latest plans to upgrade our BI 4.0 SP2 environment to SP7. We have a Test environment with the following configuration:

    1) 2 BI nodes on Suse Linux 10 ( BI 4.0 SP2 Patch18) + 1 BODS node on Windows 2008 R2 (SAP Data Services SP2 Patch 5)
    2) Tomcat on Linux server
    3) CMS DB on Oracle server
    4) FRS on Primary BI node

    Now the challenge here is we are planning to re-platform our existing Linux systems (for BI nodes) to Windows 2008 R2 hence bringing everything on windows platform.

    My queries are:

    -> Can we use our existing CMS and File stores when we do fresh SP7 build install on windows node to replicate the previous Test environment on Linux? Or do we need to migrate reports, users, universes etc from any of our co-existing environments like Dev?
    -> Can we keep our BODS node as it is, as it’s on Windows already, and bring it back to the new cluster? There is some part of BODS (al_jobservice) which is managed from the BI primary node.
    -> What precautions or best practices you would suggest in this scenario?

    This might be a consulting query but any suggestion would be really helpful.

    Thanks
    Ritz

  21. Hi Jonathan, we are going through a worst-case scenario. Our admin didn’t have the cluster key. Instead of resetting it to a new one while it was still pointing to the cluster, he asked DB to install a new server. Now the server is set up, but when we try to point it to the old cluster it asks for the cluster key. That failed, so we created a new node and moved the tables from the old cluster DB to the new DB. We put the old backup copies of the Input and Output FRS in place, but it still asks for the cluster key. Can you suggest what we need to do to bring up the old reports?

  22. Jonathan,
    In the above it’s showing that installing from a SP4 full build against an existing SP5 CMS. What about just doing an add-on server install using 4.1 SP5 when the existing cluster is 4.1 SP5 P6? (and then patching the new host to P6 immediately after install?)

  23. If they are on the same SP level you should be ok. I have done this scenario many times before. Just make sure you have a backup of the CMS database and FRS directories before you cluster in the new node.

  24. I realize I’m posting on an old thread, but I just wanted to thank you and Google for helping me out. Worked like a charm. Cheers!
