2021-09-24
15:53 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 [production]
15:52 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 [production]
15:46 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 [production]
15:41 <dpifke> Cherry-picking https://gerrit.wikimedia.org/r/c/performance/coal/+/722948 and latest https://gerrit.wikimedia.org/r/c/operations/puppet/+/721047 in deployment-prep. Should only affect deployment-webperf11. [releng]
15:23 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 [production]
15:17 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 [production]
15:09 <elukey> sudo cumin -m async -b2 "c:profile::analytics::cluster::hdfs_mount" "umount /mnt/hdfs" "mount /mnt/hdfs" - T288625 [production]
15:06 <btullis> btullis@cumin1001:~$ sudo cumin --mode async 'aqs100[4,7].eqiad.wmnet' 'nodetool-a snapshot -t T291469' 'nodetool-b snapshot -t T291469' [analytics]
14:47 <btullis> btullis@aqs1007:~$ sudo nodetool-a repair --full local_group_default_T_mediarequest_per_file data [analytics]
14:46 <dcaro> Created new project (T290768) [wikiwho]
14:32 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
14:21 <dcaro> Created new project (T290098) [fr-tech-dev]
14:07 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
14:03 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
13:31 <Amir1> start of rebuilding metadata of images in commons to make them use json [production]
13:24 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
13:02 <arturo> [codfw1dev] create VM manila-share-controller-01 on cloudinfra-codfw1dev [admin]
13:00 <arturo> [codfw1dev] rebase labs/private.git on cloudinfra-puppetmaster-01, had merge conflict [admin]
11:58 <effie> upgrading scap on canaries - T291095 [production]
11:39 <jiji@cumin1001> conftool action : set/pooled=true; selector: dnsdisc=tegola-vector-tiles [production]
11:32 <effie> uploading scap-4.0.0 to buster-wikimedia and stretch-wikimedia [production]
11:17 <effie> restart pybal in low traffic load balancers [production]
11:02 <btullis> btullis@an-master1001:~$ sudo systemctl restart hadoop-mapreduce-historyserver [analytics]
10:47 <btullis> btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-namenode [analytics]
10:47 <btullis> btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-zkfc [analytics]
10:44 <jynus> corrupting and fixing image metadata on testwiki before running script on commons T290462 [production]
10:35 <btullis> btullis@an-master1001:~$ sudo -u hdfs kerberos-run-command hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
10:16 <elukey@cumin1001> END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 [production]
10:11 <btullis@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 [production]
10:07 <btullis> btullis@an-launcher1002:~$ sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_eventlogging_legacy --ignore_failure_flag=true --table_include_regex='centralnoticeimpression' --since='2021-09-23T04:00:00.000Z' --until='2021-09-24T05:00:00.000Z' [analytics]
09:39 <jynus> upgrade and restart db2099 [production]
09:32 <btullis@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 [production]
09:29 <btullis@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 [production]
09:25 <marostegui> Rename flaggedimages on db1096(ruwiki) and db1098(arwiki) T290340 [production]
09:25 <elukey@cumin1001> START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 [production]
09:09 <jynus> upgrade and restart db2139, db2101 [production]
09:03 <btullis@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 [production]
08:35 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
08:22 <jynus> upgrade and restart db2098 T290868 [production]
08:20 <elukey@cumin1001> END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 [production]
08:08 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx2002.wikimedia.org [production]
07:59 <jmm@cumin2002> START - Cookbook sre.hosts.decommission for hosts mx2002.wikimedia.org [production]
07:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx1002.wikimedia.org [production]
07:34 <elukey@cumin1001> START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 [production]
07:17 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
07:11 <jmm@cumin2002> START - Cookbook sre.hosts.decommission for hosts mx1002.wikimedia.org [production]
07:01 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
07:01 <elukey@cumin1001> END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
07:00 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
06:55 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]