| 2023-03-08 |
  | 10:36 | <nfraison> | restart namenode in an-master1002 to take into account new quota init threads setting | [analytics] | 
            
  | 10:25 | <nfraison> | failover namenode in prod from an-master1002-eqiad-wmnet to an-master1001-eqiad-wmnet | [analytics] | 
            
  | 09:59 | <nfraison> | restart namenode in an-master1001 (standby in prod) to take into account new quota init threads setting | [analytics] | 
            
  | 09:53 | <nfraison> | restart namenode in an-test-master1002 to take into account new quota init threads setting | [analytics] | 
            
  | 09:52 | <nfraison> | failover namenode in test from an-test-master1002-eqiad-wmnet to an-test-master1001-eqiad-wmnet | [analytics] | 
            
  | 09:47 | <nfraison> | restart namenode in an-test-master1001 to take into account new quota init threads setting | [analytics] | 
            
  | 09:36 | <nfraison> | restart test hiveserver2: T303168 | [analytics] | 
            
  | 09:13 | <nfraison> | restart prod resourcemanager to take into account new dedicated exclude file | [analytics] | 
            
  | 08:58 | <nfraison> | restart test resourcemanager to take into account new dedicated exclude file | [analytics] | 
            
  | 07:56 | <nfraison> | restart prod jobhistory to take into account: https://gerrit.wikimedia.org/r/c/operations/puppet/+/894481 | [analytics] | 
            
  | 07:47 | <nfraison> | restart test jobhistory to take into account: https://gerrit.wikimedia.org/r/c/operations/puppet/+/894481 | [analytics] | 
            
  
| 2023-03-07 |
  | 22:03 | <mforns> | deployed airflow analytics again to try and fix druid_load_edit_hourly | [analytics] | 
            
  | 16:55 | <xcollazo> | deployed image-suggestions hotfix to platform_eng Airflow instance. See https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/262. | [analytics] | 
            
  | 15:23 | <btullis> | re-enabling ingestion via gobblin. | [analytics] | 
            
  | 14:59 | <nfraison> | force startup of nodemanager on analytics_cluster | [analytics] | 
            
  | 14:58 | <btullis> | pooled druid1004 | [analytics] | 
            
  | 14:57 | <btullis> | pooling aqs1010 and aqs1016 | [analytics] | 
            
  | 14:56 | <btullis> | pooling datahubsearch1001 | [analytics] | 
            
  | 14:53 | <btullis> | leaving safe mode on hdfs | [analytics] | 
            
  | 13:59 | <btullis> | disabled puppet temporarily on an-master100[1-2] to avoid an automatic restart of yarn | [analytics] | 
            
  | 13:57 | <btullis> | stopped `hadoop-yarn-resourcemanager.service` on both an-master100[1-2] | [analytics] | 
            
  | 13:54 | <btullis> | entering safe mode with `sudo -u hdfs kerberos-run-command hdfs hadoop dfsadmin -safemode enter` on an-master1002 | [analytics] | 
            
  | 12:57 | <btullis> | depooled druid1004 for T329073 | [analytics] | 
            
  | 12:56 | <btullis> | depooled datahubsearch1001 for T329073 | [analytics] | 
            
  | 12:51 | <btullis> | disabled gobblin timers on an-launcher1002 | [analytics] | 
            
  | 12:46 | <btullis> | depooling aqs1016 for T329073 | [analytics] | 
            
  | 12:45 | <btullis> | depooling aqs1010 for T329073 | [analytics] | 
            
  | 08:00 | <nfraison> | Reimage an-conf1003 to upgrade to bullseye T329362 | [analytics] |