2019-03-20 §
12:23 <arturo> last SAL entry is bogus, please ignore it [tools]
12:22 <arturo> depool and hard-reboot tools-sgeexec-0938.eqiad.wmflabs due to extreme load. It doesn't respond to ssh [tools]
12:11 <arturo> hard-reboot tools-sgeexec-0938.eqiad.wmflabs due to extreme load. It doesn't respond to ssh [tools]
10:10 <arturo> manually killing zombie procs in tools-sgewebgrid-lighttpd-0920 (T218546) [tools]
2019-03-19 §
13:56 <arturo> T218649 rebooting tools-sgecron-01 [tools]
2019-03-18 §
18:43 <bd808> Rebooting tools-static-12 [tools]
18:42 <chicocvenancio> PAWS: 3 nodes still in not ready state, `worker-10(01|07|10)` all else working [tools]
18:41 <chicocvenancio> PAWS: deleting pods stuck in Unknown state with `--grace-period=0 --force` [tools]
18:40 <andrewbogott> rebooting tools-static-13 in hopes of fixing some nfs mounts [tools]
18:25 <chicocvenancio> removing postStart hook for PWB update and restarting hub while gerrit.wikimedia.org is down [tools]
2019-03-17 §
23:41 <bd808> Cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/497210/ as a quick fix for T218494 [tools]
22:30 <bd808> Investigating strange system state on tools-bastion-03. [tools]
17:48 <bstorm_> T218514 rebooting tools-worker-1009 and 1012 [tools]
17:46 <bstorm_> depooling tools-worker-1009 and tools-worker-1012 for T218514 [tools]
17:13 <bstorm_> depooled and rebooting tools-worker-1018 [tools]
15:09 <andrewbogott> running 'killall dpkg' and 'dpkg --configure -a' on all nodes to try to work around a race with initramfs [tools]
2019-03-16 §
22:34 <bstorm_> clearing errored out queues again [tools]
2019-03-15 §
21:08 <bstorm_> cleared error state on several queues T217280 [tools]
15:58 <gtirloni> rebooted tools-clushmaster-02 [tools]
14:40 <mutante> tools-sgebastion-07 - dpkg-reconfigure locales and adding Korean ko_KR.EUC-KR - T130532 [tools]
14:32 <mutante> tools-sgebastion-07 - generating locales for user request in T130532 [tools]
2019-03-14 §
23:52 <bd808> Disabled job queues and rescheduled continuous jobs away from tools-exec-14{21,22,23,24,25,26,27,28,29,30,31,32} (T217152) [tools]
23:28 <bd808> Deleted tools-bastion-05 (T217152) [tools]
22:30 <bd808> Removed obsolete submit hosts from Trusty grid config [tools]
22:20 <bd808> Removed tools-webgrid-lighttpd-142{0,1,2,5} from the grid and shutdown instances via horizon (T217152) [tools]
22:10 <bd808> Depooled tools-webgrid-lighttpd-142{0,1,2,5} (T217152) [tools]
21:55 <bd808> Removed submit host flag from tools-bastion-05.tools.eqiad.wmflabs, removed floating ip, and shutdown instance via horizon (T217152) [tools]
21:48 <bd808> Removed tools-exec-14{33,34,35,36,37,38,39,40,41,42} from the grid and shutdown instances via horizon (T217152) [tools]
21:38 <gtirloni> rebooted tools-sgewebgrid-generic-0904 (T218341) [tools]
21:32 <gtirloni> rebooted tools-exec-1020 (T218341) [tools]
21:23 <gtirloni> rebooted tools-sgeexec-0919, tools-sgeexec-0934, tools-worker-1018 (T218341) [tools]
21:19 <bd808> Killed jobs still running on tools-exec-14{33,34,35,36,37,38,39,40,41,42}.tools.eqiad.wmflabs 2 weeks after being depooled (T217152) [tools]
20:58 <bd808> Repooled tools-sgeexec-0941 following reboot [tools]
20:57 <bd808> Hard reboot of tools-sgeexec-0941 via horizon [tools]
20:54 <bd808> Depooled and rebooted tools-sgeexec-0941.tools.eqiad.wmflabs [tools]
20:53 <bd808> Repooled tools-sgeexec-0917 following reboot [tools]
20:52 <bd808> Hard reboot of tools-sgeexec-0917 via horizon [tools]
20:47 <bd808> Depooled and rebooted tools-sgeexec-0917 [tools]
20:44 <bd808> Repooled tools-sgeexec-0908 after reboot [tools]
20:36 <bd808> depooled and rebooted tools-sgeexec-0908 [tools]
19:08 <gtirloni> rebooted tools-worker-1028 (T218341) [tools]
19:08 <gtirloni> rebooted tools-sgewebgrid-lighttpd-0914 (T218341) [tools]
19:07 <gtirloni> rebooted tools-sgewebgrid-lighttpd-0914 [tools]
18:13 <gtirloni> drained tools-worker-1028 for reboot (processes in D state) [tools]
2019-03-13 §
23:30 <bd808> Rebuilding stretch Kubernetes images [tools]
22:55 <bd808> Rebuilding jessie Kubernetes images [tools]
17:11 <bstorm_> specifically rebooted SGE cron server tools-sgecron-01 [tools]
17:10 <bstorm_> rebooted cron server [tools]
16:10 <bd808> Updated DNS for dev.tools.wmflabs.org to point to Stretch secondary bastion. This was missed on 2019-03-07 [tools]
12:33 <arturo> reboot tools-sgebastion-08 (T215154) [tools]