Deploying Oracle Enterprise Manager Cloud Control 13c R3 with Zero Warnings on a Windows 2012 R2/2019 Domain Member

The installation media baseline used in this article:

setup_em13300_win64.exe
setup_em13300_win64-2.zip
setup_em13300_win64-3.zip
setup_em13300_win64-4.zip
setup_em13300_win64-5.zip
setup_em13300_win64-6.zip

en_windows_server_2012_r2_vl_with_update_x64_dvd_4065221.iso

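After deploying RAC in a workgroup environment last time, I moved straight on to deploying OEM 13c on Windows 2019, and it kept failing. First the installer hung at 20%; on the next attempt it hung at 50%. I started to wonder whether OEM was harder to get right than RAC. Documentation turned out to be scarce, and hardly anyone runs this on Windows. Searching Google, I came across reports of the same hang under VirtualBox and VMware Fusion, so I applied the same idea to the vSphere platform and changed the vmx parameters, which resolved the hang: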

If you are on VMware Workstation, edit the vmx file directly as well:
monitor_control.enable_fullcpuid = TRUE
cpuid.4.4.eax = "0000:0000:0000:0000:0000:0000:0000:0000"
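
If you prefer to script the change rather than edit the file by hand, the following is a minimal sketch. It assumes the VM is powered off and that the vmx file is UTF-8 text; the function names and the `.bak` backup are my own illustration, not part of any VMware tooling. The two values are copied verbatim from the lines above.

```python
from pathlib import Path

# The two settings quoted above, copied verbatim from the article.
VMX_SETTINGS = {
    "monitor_control.enable_fullcpuid": "TRUE",
    "cpuid.4.4.eax": '"0000:0000:0000:0000:0000:0000:0000:0000"',
}


def apply_vmx_settings(text, settings=VMX_SETTINGS):
    """Return vmx text in which each key in `settings` appears exactly once."""
    wanted = {key.lower() for key in settings}
    kept = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip().lower()
        if key not in wanted:  # drop stale copies of the keys we are setting
            kept.append(line)
    kept.extend(f"{key} = {value}" for key, value in settings.items())
    return "\n".join(kept) + "\n"


def patch_vmx_file(path):
    """Hypothetical helper: back up the vmx file, then rewrite it in place."""
    vmx = Path(path)
    original = vmx.read_text(encoding="utf-8")  # assumption: UTF-8 encoded
    Path(str(vmx) + ".bak").write_text(original, encoding="utf-8")
    vmx.write_text(apply_vmx_settings(original), encoding="utf-8")
```

Calling `patch_vmx_file` with the path to your own vmx file rewrites it; `apply_vmx_settings` can be tried on a string first to preview the result.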

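With the hang resolved, the installation then failed repeatedly at 80%. For a while I suspected a compatibility problem between OEM and Windows 2019, so I switched to Windows 2012 R2, but the problem persisted.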

Key details from emctl.log:
2020-01-19 21:51:31,080 [main] INFO  wls.OMSController main.220 - Executing emctl command : start
2020-01-19 21:51:35,361 [main] INFO  commands.BaseCommand getEnvProps.486 - nm home replaced : C:/app/gc_inst/user_projects/domains/GCDomain/nodemanager
2020-01-19 21:51:35,377 [Thread-2] INFO  commands.BaseCommand run.605 - <ERR>System error 5 has occurred.
2020-01-19 21:51:35,377 [Thread-2] INFO  commands.BaseCommand run.605 - <ERR>
2020-01-19 21:51:35,377 [Thread-2] INFO  commands.BaseCommand run.605 - <ERR>Access is denied.
2020-01-19 21:51:35,377 [Thread-2] INFO  commands.BaseCommand run.605 - <ERR>
2020-01-19 21:51:35,377 [main] INFO  commands.StartCommand startOMS.452 - net start return code is 2
2020-01-19 21:51:35,377 [main] ERROR commands.BaseCommand logAndPrint.651 - Windows service OracleManagementServer_EMGC_OMS1_1 failed to be started
2020-01-19 21:51:37,426 [main] INFO  commands.BaseCommand logAndPrint.653 - Oracle Management Server is Down
2020-01-19 21:51:37,426 [main] INFO  commands.BaseCommand printMessage.413 - statusOMS finished with result: 8
2020-01-19 21:51:37,442 [main] INFO  ctrl_extn.EmctlCtrlExtnLoader logp.251 - Extensions found: 1
2020-01-19 21:51:37,442 [main] INFO  ctrl_extn.EmctlCtrlExtnLoader logp.251 - Executing callback for extensible_sample
2020-01-19 21:51:37,442 [main] INFO  ctrl_extn.EmctlCtrlExtnLoader logp.251 - jar is C:\app\middleware\plugins\oracle.sysman.emas.oms.plugin_13.3.1.0.0\archives\jvmd\em-engines-emctl.jar; class is oracle.sysman.emctl.jvmd.JVMDEmctlStatusImpl
2020-01-19 21:51:39,479 [main] INFO  ctrl_extn.EmctlCtrlExtnLoader logp.251 - Connection refused: connect
2020-01-19 21:51:39,479 [main] INFO  ctrl_extn.EmctlCtrlExtnLoader logp.251 - rsp is 1 message is JVMD Engine is Down
2020-01-19 21:51:39,479 [main] INFO  commands.BaseCommand printMessage.426 - extensible_sample rsp is 1 message is JVMD Engine is Down
2020-01-19 21:51:39,479 [main] INFO  commands.BaseCommand logAndPrint.653 - JVMD Engine is Down
2020-01-19 21:51:39,479 [main] ERROR commands.BaseCommand logAndPrint.651 - Please check C:/app/gc_inst/em/EMGC_OMS1\sysman\log\emctl.log for error details
2020-01-19 21:51:39,479 [main] INFO  commands.BaseCommand getEnvProps.486 - nm home replaced : C:/app/gc_inst/user_projects/domains/GCDomain/nodemanager
2020-01-19 21:51:40,213 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:40,213 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Initializing WebLogic Scripting Tool (WLST) ...
2020-01-19 21:51:40,213 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:47,985 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Welcome to WebLogic Server Administration Scripting Shell
2020-01-19 21:51:48,001 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,001 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Type help() for help on available commands
2020-01-19 21:51:48,001 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>nm home is C:/app/gc_inst/user_projects/domains/GCDomain/nodemanager
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>OHS component name: ohs1
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>bip_only_start = TRUE
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>bip_only_start is true, ms_name = BIP
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>is_admin_host is TRUE
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>admin_start is FALSE
2020-01-19 21:51:48,048 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>admin_only_start is NONE
2020-01-19 21:51:48,064 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>oms_only_start is FALSE
2020-01-19 21:51:48,064 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>ohs_start is FALSE
2020-01-19 21:51:48,079 [Thread-4] INFO  commands.BaseCommand run.605 - <ERR>log4j:WARN No appenders could be found for logger (emctl.secure.oms.AdminCredsWalletUtil).
2020-01-19 21:51:48,079 [Thread-4] INFO  commands.BaseCommand run.605 - <ERR>log4j:WARN Please initialize the log4j system properly.
2020-01-19 21:51:48,314 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Connecting to Node Manager ...
2020-01-19 21:51:48,360 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT><2020-1-19 PM09时51分48秒 CST> <Info> <Security> <BEA-090905> <Disabling the CryptoJ JCE Provider self-integrity check for better startup performance. To enable this check, specify -Dweblogic.security.allowCryptoJDefaultJCEVerification=true.> 
2020-01-19 21:51:48,360 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT><2020-1-19 PM09时51分48秒 CST> <Info> <Security> <BEA-090906> <Changing the default Random Number Generator in RSA CryptoJ from ECDRBG128 to FIPS186PRNG. To disable this change, specify -Dweblogic.security.allowCryptoJDefaultPRNG=true.> 
2020-01-19 21:51:48,360 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT><2020-1-19 PM09时51分48秒 CST> <Info> <Security> <BEA-090909> <Using the configured custom SSL Hostname Verifier implementation: weblogic.security.utils.SSLWLSHostnameVerifier$NullHostnameVerifier.> 
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Successfully Connected to Node Manager.
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>status of node manager:
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Currently connected to Node Manager to monitor the domain GCDomain.
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>current status of EMGC_ADMINSERVER:
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>RUNNING
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,735 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>current status of BIP:
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>RUNNING
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>status of node manager:
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Currently connected to Node Manager to monitor the domain GCDomain.
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>1
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>Successfully disconnected from Node Manager.
2020-01-19 21:51:48,751 [Thread-3] INFO  commands.BaseCommand run.605 - <OUT>_END_
2020-01-19 21:51:48,751 [Thread-4] INFO  commands.BaseCommand run.605 - <ERR>_END_
2020-01-19 21:51:49,048 [main] ERROR commands.BaseCommand logAndPrint.651 - BI Publisher Server Already Started
2020-01-19 21:51:49,079 [main] INFO  commands.BaseCommand logAndPrint.653 - BI Publisher Server is Up
2020-01-19 21:51:49,079 [main] INFO  commands.StartCommand execute.209 - retCode from start BIP = 0
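
The lines that matter in an excerpt like this are easy to miss among the INFO noise. As a rough aid, here is a small sketch that pulls out the ERROR-level entries and any non-empty `<ERR>` payloads. The regular expression is only inferred from the log lines shown above, and `find_problems` is a name I made up for illustration.

```python
import re

# Pattern inferred from the excerpt above: timestamp, [thread], level,
# logger class, method.line, then " - " and the message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] (?P<level>\w+)\s+\S+ \S+ - (?P<msg>.*)$"
)


def find_problems(lines):
    """Return (timestamp, message) pairs for ERROR lines and <ERR> output."""
    problems = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue  # not in the expected format
        msg = match["msg"].strip()
        if match["level"] == "ERROR":
            problems.append((match["ts"], msg))
        elif msg.startswith("<ERR>"):
            payload = msg[len("<ERR>"):].strip()
            # Skip blank stderr lines and the _END_ stream marker.
            if payload and payload != "_END_":
                problems.append((match["ts"], payload))
    return problems
```

Run against the excerpt above, this surfaces, among others, "System error 5 has occurred.", "Access is denied." and the failure to start the OracleManagementServer_EMGC_OMS1_1 service, which points at a permission problem rather than an OS compatibility issue. Note that the log4j warnings are also written to stderr and will show up in the result.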

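So installing OEM is clearly not a matter of clicking Next all the way through. I had no choice but to go back and carefully read the official documents, Cloud Control Basic Installation Guide.pdf and Cloud Control Advanced Installation and Configuration Guide.pdf. They contain very little Windows-specific information, but the few points they do make are all critical: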

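Items 1-4 in the figure above are all essentially permission issues, and all of them can be granted indirectly by adding the installation user to the local Administrators group. To be safe, though, grant each one explicitly anyway.
This article uses the citrix\root user as an example. Although the root account I used belongs to the Domain Admins group by default, that membership did not seem to take effect.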

After making these changes, a reboot is recommended. My installation completed without one, but I still suggest rebooting.

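Many articles skip the preparation of the OEM repository database and go straight to the installation, but there are pitfalls here too. This article uses a PDB on Oracle 19c RAC as the example. Oracle does provide dedicated database templates for OEM, such as 18.1.0.0.0_Database_Template_for_EM13_3_0_0_0_Windows.zip, but they are fairly complicated and I don't recommend them. In addition, several global parameters need to be changed system-wide: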

For a Small OEM environment:
alter system set parallel_max_servers=8 SCOPE=SPFILE;
alter system set session_cached_cursors=200 SCOPE=SPFILE;
alter system set sga_target=3000000000 SCOPE=SPFILE;
alter system set pga_aggregate_target=1000000000 SCOPE=SPFILE;
alter system set shared_pool_size='600000000';
alter system set "_allow_insert_with_update_check"=TRUE scope=both sid='*';

Medium environment:
alter system set parallel_max_servers=8 SCOPE=SPFILE;
alter system set session_cached_cursors=200 SCOPE=SPFILE;
alter system set sga_target=5000000000 SCOPE=SPFILE;
alter system set pga_aggregate_target=1340000000 SCOPE=SPFILE;
alter system set shared_pool_size='600000000';
alter system set "_allow_insert_with_update_check"=TRUE scope=both sid='*';

Large environment:
alter system set parallel_max_servers=8 SCOPE=SPFILE;
alter system set session_cached_cursors=200 SCOPE=SPFILE;
alter system set sga_target=8000000000 SCOPE=SPFILE;
alter system set pga_aggregate_target=1600000000 SCOPE=SPFILE;
alter system set shared_pool_size='600000000';
alter system set "_allow_insert_with_update_check"=TRUE scope=both sid='*';

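Paste the statements into a SQL session and run them. Parameters set with SCOPE=SPFILE take effect only after the instance is restarted.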
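
The three parameter sets differ only in sga_target and pga_aggregate_target. The sketch below makes that explicit by generating the statements from a small table. The values are taken from the lists above; the `SIZES` table and the `repository_statements` function are hypothetical names for illustration and are not part of any Oracle tool.

```python
# Values copied from the Small / Medium / Large lists above.
SIZES = {
    "small": {"sga_target": 3000000000, "pga_aggregate_target": 1000000000},
    "medium": {"sga_target": 5000000000, "pga_aggregate_target": 1340000000},
    "large": {"sga_target": 8000000000, "pga_aggregate_target": 1600000000},
}


def repository_statements(size):
    """Return the ALTER SYSTEM statements for the given OEM deployment size."""
    params = SIZES[size.lower()]
    return [
        "alter system set parallel_max_servers=8 SCOPE=SPFILE;",
        "alter system set session_cached_cursors=200 SCOPE=SPFILE;",
        f"alter system set sga_target={params['sga_target']} SCOPE=SPFILE;",
        "alter system set pga_aggregate_target="
        f"{params['pga_aggregate_target']} SCOPE=SPFILE;",
        "alter system set shared_pool_size='600000000';",
        "alter system set \"_allow_insert_with_update_check\"=TRUE "
        "scope=both sid='*';",
    ]


if __name__ == "__main__":
    # Print the Medium set so it can be pasted into a SQL session.
    print("\n".join(repository_statements("medium")))
```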

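Verify connectivity to the PDB in advance, then set up permissions on the OEM installation directories: create two sibling folders, c:\app\middleware and c:\app\agent, and grant full control on the app directory.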
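
The folder preparation can also be scripted. The sketch below is one possible way to do it, not the method used in the screenshots: it builds the two folder paths and an `icacls` command that grants inherited full control on c:\app. Granting to the citrix\root account is my assumption, based on the example user mentioned earlier; the function names are made up for illustration.

```python
import os
import subprocess
from pathlib import PureWindowsPath


def oem_folders(base=r"C:\app"):
    """Return the two sibling folders the article creates under the base."""
    root = PureWindowsPath(base)
    return [root / "middleware", root / "agent"]


def icacls_grant_command(folder, account):
    """Build an icacls call granting full control, inherited by children.

    (OI)(CI)F = object inherit, container inherit, full access;
    /T also applies the change to existing files and subfolders.
    """
    return ["icacls", str(PureWindowsPath(folder)), "/grant",
            f"{account}:(OI)(CI)F", "/T"]


def prepare(account, base=r"C:\app"):
    """Create the folders and apply the grant (run on the Windows host)."""
    for folder in oem_folders(base):
        os.makedirs(str(folder), exist_ok=True)
    subprocess.run(icacls_grant_command(base, account), check=True)
```

On the OEM host, run from an elevated prompt, `prepare(r"citrix\root")` would create both folders and apply the grant; substitute your own installation account.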

Now start the actual installation. Screenshots of the key steps follow:

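Choose the Advanced installation
The hostname changes to the FQDN automatically
Password complexity: upper- and lower-case letters plus digits
The Service Name must include the domain suffix; entering just PDB1 causes an error
Just click OK all the way through here
Note that the ASM disk is referenced as +DATA, not as an ordinary drive-letter path
Once you get past the Start OMS stage, you're in the clear!
Done!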

For the detailed troubleshooting and installation process, see the following video:

https://www.youtube.com/watch?v=ibjICTOu534