2021-03-09
|
22:00 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 14 |
[analytics] |
20:42 |
<elukey> |
reimaged an-worker1091 to buster |
[analytics] |
18:26 |
<elukey> |
reimage an-worker1087 to buster |
[analytics] |
16:40 |
<elukey> |
reimage analytics1077 to buster |
[analytics] |
15:36 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 13 |
[analytics] |
15:18 |
<elukey> |
reimage analytics1072 (hadoop hdfs journal node) to buster |
[analytics] |
14:29 |
<elukey> |
drain + reimage an-worker1090/89 to Buster |
[analytics] |
13:26 |
<elukey> |
reimage an-worker1102 and an-worker1080 (hdfs journal node) to Buster |
[analytics] |
12:59 |
<elukey> |
drain + reimage an-worker1103 to Buster |
[analytics] |
09:14 |
<elukey> |
drain + reimage analytics1076 and an-worker1112 to Buster |
[analytics] |
07:01 |
<elukey> |
drain + reimage an-worker109[4,5] to Buster |
[analytics] |
2021-03-08
|
23:22 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 12 |
[analytics] |
18:49 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 11 |
[analytics] |
18:11 |
<elukey> |
drain + reimage an-worker11[15,16] to Buster |
[analytics] |
17:12 |
<elukey> |
drain + reimage an-worker11[13,14] to Buster |
[analytics] |
16:17 |
<elukey> |
drain + reimage an-worker1109/1110 to Buster |
[analytics] |
14:54 |
<elukey> |
drain + reimage an-worker110[7,8] to Buster |
[analytics] |
14:52 |
<ottomata> |
altered topics (eqiad|codfw).mediawiki.client.session_tick to have 2 partitions - T276502 |
[analytics] |
13:51 |
<elukey> |
drain + reimage an-worker110[4,5] to Buster |
[analytics] |
10:41 |
<elukey> |
drain + reimage an-worker1104/1089 to Debian Buster |
[analytics] |
09:19 |
<elukey> |
drain + reimage an-worker108[3,4] to Buster |
[analytics] |
08:20 |
<elukey> |
drain + reimage an-worker108[1,2] to Buster |
[analytics] |
07:23 |
<elukey> |
drain + reimage analytics107[4,5] to Buster |
[analytics] |
2021-03-05
|
18:30 |
<razzi> |
run again sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new |
[analytics] |
18:18 |
<razzi> |
sudo cookbook sre.dns.netbox -t T269211 "Move clouddb1021 to private vlan" |
[analytics] |
18:17 |
<razzi> |
re-run interface_automation.ProvisionServerNetwork with private vlan |
[analytics] |
18:16 |
<razzi> |
delete non-mgmt interface for clouddb1021 |
[analytics] |
17:07 |
<razzi> |
sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new |
[analytics] |
16:54 |
<razzi> |
sudo cookbook sre.dns.netbox -t T269211 "Reimage and rename labsdb1012 to clouddb1021" |
[analytics] |
16:52 |
<razzi> |
run script at https://netbox.wikimedia.org/extras/scripts/interface_automation.ProvisionServerNetwork/ |
[analytics] |
16:47 |
<razzi> |
edit https://netbox.wikimedia.org/dcim/devices/2078/ device name from labsdb1012 to clouddb1021 |
[analytics] |
16:30 |
<razzi> |
delete non-mgmt interfaces for labsdb1012 at https://netbox.wikimedia.org/dcim/devices/2078/interfaces/ |
[analytics] |
16:28 |
<razzi> |
rename https://netbox.wikimedia.org/ipam/ip-addresses/734/ DNS name from labsdb1012.mgmt.eqiad.wmnet to clouddb1021.mgmt.eqiad.wmnet |
[analytics] |
16:08 |
<razzi> |
sudo cookbook sre.hosts.decommission labsdb1012.eqiad.wmnet -t T269211 |
[analytics] |
15:52 |
<razzi> |
stop mariadb on labsdb1012 |
[analytics] |
15:39 |
<razzi> |
rebalance kafka partitions for webrequest_upload partition 10 |
[analytics] |
15:07 |
<elukey> |
drain + reimage analytics1073 and an-worker1086 to Debian Buster |
[analytics] |
13:36 |
<elukey> |
roll restart HDFS Namenodes for the Hadoop cluster to pick up new Xmx settings (https://gerrit.wikimedia.org/r/c/operations/puppet/+/668659) |
[analytics] |
10:20 |
<elukey> |
force run of refinery-druid-drop-public-snapshots to check Druid public's performance |
[analytics] |
10:06 |
<elukey> |
failover HDFS Namenode from 1002 to 1001 (high GC pauses triggered the HDFS zkfc daemon on 1001 and the failover to 1002) |
[analytics] |
08:32 |
<elukey> |
drain + reimage an-worker107[8,9] to Debian Buster (one Journal node included) |
[analytics] |
07:22 |
<elukey> |
drain + reimage analytics107[0-1] to Debian Buster |
[analytics] |
07:13 |
<elukey> |
add analytics1066 back with /dev/sdb removed |
[analytics] |
07:01 |
<elukey> |
stop hadoop daemons on analytics1066 - disk errors on /dev/sdb after reimage |
[analytics] |