commit 7a4dc3f8399780455a16a167068a03f317e1204d Author: Greg Kroah-Hartman Date: Wed Nov 21 09:26:04 2018 +0100 Linux 4.9.138 commit 7088c66504c236003cb2f138d9fec9f7c6076c40 Author: Mark Rutland Date: Wed Oct 17 17:42:10 2018 +0100 KVM: arm64: Fix caching of host MDCR_EL2 value commit da5a3ce66b8bb51b0ea8a89f42aac153903f90fb upstream. At boot time, KVM stashes the host MDCR_EL2 value, but only does this when the kernel is not running in hyp mode (i.e. is non-VHE). In these cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can lead to CONSTRAINED UNPREDICTABLE behaviour. Since we use this value to derive the MDCR_EL2 value when switching to/from a guest, after a guest have been run, the performance counters do not behave as expected. This has been observed to result in accesses via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant counters, resulting in events not being counted. In these cases, only the fixed-purpose cycle counter appears to work as expected. Fix this by always stashing the host MDCR_EL2 value, regardless of VHE. Cc: Christopher Dall Cc: James Morse Cc: Will Deacon Cc: stable@vger.kernel.org Fixes: 1e947bad0b63b351 ("arm64: KVM: Skip HYP setup when already running in HYP") Tested-by: Robin Murphy Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Signed-off-by: Greg Kroah-Hartman commit cc5bd86e271d97d8b29ba7524074600344a4505a Author: Chris Wilson Date: Thu Nov 8 08:17:38 2018 +0000 drm/i915/execlists: Force write serialisation into context image vs execution commit 0a823e8fd4fd67726697854578f3584ee3a49b1d upstream. Ensure that the writes into the context image are completed prior to the register mmio to trigger execution. Although previously we were assured by the SDM that all writes are flushed before an uncached memory transaction (our mmio write to submit the context to HW for execution), we have empirical evidence to believe that this is not actually the case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656 References: https://bugs.freedesktop.org/show_bug.cgi?id=108315 References: https://bugs.freedesktop.org/show_bug.cgi?id=106887 Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: Tvrtko Ursulin Acked-by: Mika Kuoppala Link: https://patchwork.freedesktop.org/patch/msgid/20181108081740.25615-1-chris@chris-wilson.co.uk Cc: stable@vger.kernel.org (cherry picked from commit 987abd5c62f92ee4970b45aa077f47949974e615) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 232ed06fd179b1be758dfbc53a491df5aeb374f7 Author: Clint Taylor Date: Thu Oct 25 11:52:00 2018 -0700 drm/i915/hdmi: Add HDMI 2.0 audio clock recovery N values commit 6503493145cba4413ecd3d4d153faeef4a1e9b85 upstream. HDMI 2.0 594Mhz modes were incorrectly selecting 25.200Mhz Automatic N value mode instead of HDMI specification values. V2: Fix 88.2 Hz N value Cc: Jani Nikula Cc: stable@vger.kernel.org Signed-off-by: Clint Taylor Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/1540493521-1746-2-git-send-email-clinton.a.taylor@intel.com (cherry picked from commit 5a400aa3c562c4a726b4da286e63c96db905ade1) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 9c926f10eaabd60a35eece8c769ed40cab304a20 Author: Stanislav Lisovskiy Date: Fri Nov 9 11:00:12 2018 +0200 drm/dp_mst: Check if primary mstb is null commit 23d8003907d094f77cf959228e2248d6db819fa7 upstream. Unfortunately drm_dp_get_mst_branch_device which is called from both drm_dp_mst_handle_down_rep and drm_dp_mst_handle_up_rep seem to rely on that mgr->mst_primary is not NULL, which seem to be wrong as it can be cleared with simultaneous mode set, if probing fails or in other case. mgr->lock mutex doesn't protect against that as it might just get assigned to NULL right before, not simultaneously. There are currently bugs 107738, 108616 bugs which crash in drm_dp_get_mst_branch_device, caused by this issue. v2: Refactored the code, as it was nicely noticed. Fixed Bugzilla bug numbers(second was 108616, but not 108816) and added links. [changed title and added stable cc] Signed-off-by: Lyude Paul Signed-off-by: Stanislav Lisovskiy Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108616 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107738 Link: https://patchwork.freedesktop.org/patch/msgid/20181109090012.24438-1-stanislav.lisovskiy@intel.com Signed-off-by: Greg Kroah-Hartman commit fd2038380e2d3e5e5fd280abb2896f58e27e0591 Author: Marc Zyngier Date: Sun Aug 5 13:48:07 2018 +0100 drm/rockchip: Allow driver to be shutdown on reboot/kexec commit 7f3ef5dedb146e3d5063b6845781ad1bb59b92b5 upstream. Leaving the DRM driver enabled on reboot or kexec has the annoying effect of leaving the display generating transactions whilst the IOMMU has been shut down. In turn, the IOMMU driver (which shares its interrupt line with the VOP) starts warning either on shutdown or when entering the secondary kernel in the kexec case (nothing is expected on that front). A cheap way of ensuring that things are nicely shut down is to register a shutdown callback in the platform driver. Signed-off-by: Marc Zyngier Tested-by: Vicente Bergas Signed-off-by: Heiko Stuebner Link: https://patchwork.freedesktop.org/patch/msgid/20180805124807.18169-1-marc.zyngier@arm.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 9c34ad0ce3b31aedf2134d3b28eab59f6e860ef3 Author: Mike Kravetz Date: Fri Oct 5 15:51:29 2018 -0700 mm: migration: fix migration of huge PMD shared pages commit 017b1660df89f5fb4bfe66c34e35f7d2031100c7 upstream. The page migration code employs try_to_unmap() to try and unmap the source page. This is accomplished by using rmap_walk to find all vmas where the page is mapped. This search stops when page mapcount is zero. For shared PMD huge pages, the page map count is always 1 no matter the number of mappings. Shared mappings are tracked via the reference count of the PMD page. Therefore, try_to_unmap stops prematurely and does not completely unmap all mappings of the source page. This problem can result is data corruption as writes to the original source page can happen after contents of the page are copied to the target page. Hence, data is lost. This problem was originally seen as DB corruption of shared global areas after a huge page was soft offlined due to ECC memory errors. DB developers noticed they could reproduce the issue by (hotplug) offlining memory used to back huge pages. A simple testcase can reproduce the problem by creating a shared PMD mapping (note that this must be at least PUD_SIZE in size and PUD_SIZE aligned (1GB on x86)), and using migrate_pages() to migrate process pages between nodes while continually writing to the huge pages being migrated. To fix, have the try_to_unmap_one routine check for huge PMD sharing by calling huge_pmd_unshare for hugetlbfs huge pages. If it is a shared mapping it will be 'unshared' which removes the page table entry and drops the reference on the PMD page. After this, flush caches and TLB. mmu notifiers are called before locking page tables, but we can not be sure of PMD sharing until page tables are locked. Therefore, check for the possibility of PMD sharing before locking so that notifiers can prepare for the worst possible case. Link: http://lkml.kernel.org/r/20180823205917.16297-2-mike.kravetz@oracle.com [mike.kravetz@oracle.com: make _range_in_vma() a static inline] Link: http://lkml.kernel.org/r/6063f215-a5c8-2f0c-465a-2c515ddc952d@oracle.com Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz Acked-by: Kirill A. Shutemov Reviewed-by: Naoya Horiguchi Acked-by: Michal Hocko Cc: Vlastimil Babka Cc: Davidlohr Bueso Cc: Jerome Glisse Cc: Mike Kravetz Cc: Signed-off-by: Andrew Morton Signed-off-by: Mike Kravetz Reviewed-by: Jérôme Glisse Acked-by: Michal Hocko Signed-off-by: Greg Kroah-Hartman commit f8d4c943f2cf3bc0346b70bd599050c4db920703 Author: Mike Kravetz Date: Fri Nov 16 15:08:04 2018 -0800 hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! commit 5e41540c8a0f0e98c337dda8b391e5dda0cde7cf upstream. This bug has been experienced several times by the Oracle DB team. The BUG is in remove_inode_hugepages() as follows: /* * If page is mapped, it was faulted in after being * unmapped in caller. Unmap (again) now after taking * the fault mutex. The mutex will prevent faults * until we finish removing the page. * * This race can only happen in the hole punch case. * Getting here in a truncate operation is a bug. */ if (unlikely(page_mapped(page))) { BUG_ON(truncate_op); In this case, the elevated map count is not the result of a race. Rather it was incorrectly incremented as the result of a bug in the huge pmd sharing code. Consider the following: - Process A maps a hugetlbfs file of sufficient size and alignment (PUD_SIZE) that a pmd page could be shared. - Process B maps the same hugetlbfs file with the same size and alignment such that a pmd page is shared. - Process B then calls mprotect() to change protections for the mapping with the shared pmd. As a result, the pmd is 'unshared'. - Process B then calls mprotect() again to chage protections for the mapping back to their original value. pmd remains unshared. - Process B then forks and process C is created. During the fork process, we do dup_mm -> dup_mmap -> copy_page_range to copy page tables. Copying page tables for hugetlb mappings is done in the routine copy_hugetlb_page_range. In copy_hugetlb_page_range(), the destination pte is obtained by: dst_pte = huge_pte_alloc(dst, addr, sz); If pmd sharing is possible, the returned pointer will be to a pte in an existing page table. In the situation above, process C could share with either process A or process B. Since process A is first in the list, the returned pte is a pointer to a pte in process A's page table. However, the check for pmd sharing in copy_hugetlb_page_range is: /* If the pagetables are shared don't copy or take references */ if (dst_pte == src_pte) continue; Since process C is sharing with process A instead of process B, the above test fails. The code in copy_hugetlb_page_range which follows assumes dst_pte points to a huge_pte_none pte. It copies the pte entry from src_pte to dst_pte and increments this map count of the associated page. This is how we end up with an elevated map count. To solve, check the dst_pte entry for huge_pte_none. If !none, this implies PMD sharing so do not copy. Link: http://lkml.kernel.org/r/20181105212315.14125-1-mike.kravetz@oracle.com Fixes: c5c99429fa57 ("fix hugepages leak due to pagetable page sharing") Signed-off-by: Mike Kravetz Reviewed-by: Naoya Horiguchi Cc: Michal Hocko Cc: Hugh Dickins Cc: Andrea Arcangeli Cc: "Kirill A . Shutemov" Cc: Davidlohr Bueso Cc: Prakash Sangappa Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit e133c33edf3b63193fceb58a5d5249eed63cb572 Author: Arnd Bergmann Date: Fri Nov 16 15:08:35 2018 -0800 lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn commit 1c23b4108d716cc848b38532063a8aca4f86add8 upstream. gcc-8 complains about the prototype for this function: lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes] This is actually a GCC's bug. In GCC internals __ubsan_handle_builtin_unreachable() declared with both 'noreturn' and 'const' attributes instead of only 'noreturn': https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210 Workaround this by removing the noreturn attribute. [aryabinin: add information about GCC bug in changelog] Link: http://lkml.kernel.org/r/20181107144516.4587-1-aryabinin@virtuozzo.com Signed-off-by: Arnd Bergmann Signed-off-by: Andrey Ryabinin Acked-by: Olof Johansson Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8f1756ad8ba4881c9d8c263718c9e220e00e3e01 Author: Guenter Roeck Date: Sun Jul 1 13:56:54 2018 -0700 configfs: replace strncpy with memcpy commit 1823342a1f2b47a4e6f5667f67cd28ab6bc4d6cd upstream. gcc 8.1.0 complains: fs/configfs/symlink.c:67:3: warning: 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length fs/configfs/symlink.c: In function 'configfs_get_link': fs/configfs/symlink.c:63:13: note: length computed here Using strncpy() is indeed less than perfect since the length of data to be copied has already been determined with strlen(). Replace strncpy() with memcpy() to address the warning and optimize the code a little. Signed-off-by: Guenter Roeck Signed-off-by: Christoph Hellwig Signed-off-by: Nobuhiro Iwamatsu Signed-off-by: Greg Kroah-Hartman commit d180feedae04deda20380e4af7eba94e6b416f3c Author: Miklos Szeredi Date: Fri Nov 9 15:52:16 2018 +0100 fuse: fix leaked notify reply commit 7fabaf303458fcabb694999d6fa772cc13d4e217 upstream. fuse_request_send_notify_reply() may fail if the connection was reset for some reason (e.g. fs was unmounted). Don't leak request reference in this case. Besides leaking memory, this resulted in fc->num_waiting not being decremented and hence fuse_wait_aborted() left in a hanging and unkillable state. Fixes: 2d45ba381a74 ("fuse: add retrieve request") Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests") Reported-and-tested-by: syzbot+6339eda9cb4ebbc4c37b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi Cc: #v2.6.36 Signed-off-by: Greg Kroah-Hartman commit fdd93795b825fcab0a4f045b626f46de837703b4 Author: Lukas Czerner Date: Fri Nov 9 14:51:46 2018 +0100 fuse: fix use-after-free in fuse_direct_IO() commit ebacb81273599555a7a19f7754a1451206a5fc4f upstream. In async IO blocking case the additional reference to the io is taken for it to survive fuse_aio_complete(). In non blocking case this additional reference is not needed, however we still reference io to figure out whether to wait for completion or not. This is wrong and will lead to use-after-free. Fix it by storing blocking information in separate variable. This was spotted by KASAN when running generic/208 fstest. Signed-off-by: Lukas Czerner Reported-by: Zorro Lang Signed-off-by: Miklos Szeredi Fixes: 744742d692e3 ("fuse: Add reference counting for fuse_io_priv") Cc: # v4.6 Signed-off-by: Greg Kroah-Hartman commit c616a9326b235e6531a129f1bd68f7767607302e Author: Maciej W. Rozycki Date: Mon Nov 5 03:48:25 2018 +0000 rtc: hctosys: Add missing range error reporting commit 7ce9a992ffde8ce93d5ae5767362a5c7389ae895 upstream. Fix an issue with the 32-bit range error path in `rtc_hctosys' where no error code is set and consequently the successful preceding call result from `rtc_read_time' is propagated to `rtc_hctosys_ret'. This in turn makes any subsequent call to `hctosys_show' incorrectly report in sysfs that the system time has been set from this RTC while it has not. Set the error to ERANGE then if we can't express the result due to an overflow. Signed-off-by: Maciej W. Rozycki Fixes: b3a5ac42ab18 ("rtc: hctosys: Ensure system time doesn't overflow time_t") Cc: stable@vger.kernel.org # 4.17+ Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 7291d95a97fc89044301b197c760555e894e82c7 Author: Scott Mayhew Date: Thu Nov 8 11:11:36 2018 -0500 nfsd: COPY and CLONE operations require the saved filehandle to be set commit 01310bb7c9c98752cc763b36532fab028e0f8f81 upstream. Make sure we have a saved filehandle, otherwise we'll oops with a null pointer dereference in nfs4_preprocess_stateid_op(). Signed-off-by: Scott Mayhew Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 4d9d47c7e5bf72067263b13d2bed9f9bffe45325 Author: Frank Sorenson Date: Tue Oct 30 15:10:40 2018 -0500 sunrpc: correct the computation for page_ptr when truncating commit 5d7a5bcb67c70cbc904057ef52d3fcfeb24420bb upstream. When truncating the encode buffer, the page_ptr is getting advanced, causing the next page to be skipped while encoding. The page is still included in the response, so the response contains a page of bogus data. We need to adjust the page_ptr backwards to ensure we encode the next page into the correct place. We saw this triggered when concurrent directory modifications caused nfsd4_encode_direct_fattr() to return nfserr_noent, and the resulting call to xdr_truncate_encode() corrupted the READDIR reply. Signed-off-by: Frank Sorenson Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields Signed-off-by: Greg Kroah-Hartman commit 03004f416351c3f6760fa577fb9024847f7f5f48 Author: Eric W. Biederman Date: Thu Oct 25 12:05:11 2018 -0500 mount: Prevent MNT_DETACH from disconnecting locked mounts commit 9c8e0a1b683525464a2abe9fb4b54404a50ed2b4 upstream. Timothy Baldwin wrote: > As per mount_namespaces(7) unprivileged users should not be able to look under mount points: > > Mounts that come as a single unit from more privileged mount are locked > together and may not be separated in a less privileged mount namespace. > > However they can: > > 1. Create a mount namespace. > 2. In the mount namespace open a file descriptor to the parent of a mount point. > 3. Destroy the mount namespace. > 4. Use the file descriptor to look under the mount point. > > I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8. > > The setup: > > $ sudo sysctl kernel.unprivileged_userns_clone=1 > kernel.unprivileged_userns_clone = 1 > $ mkdir -p A/B/Secret > $ sudo mount -t tmpfs hide A/B > > > "Secret" is indeed hidden as expected: > > $ ls -lR A > A: > total 0 > drwxrwxrwt 2 root root 40 Feb 12 21:08 B > > A/B: > total 0 > > > The attack revealing "Secret": > > $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4 /proc/self/fd/4/: > total 0 > drwxr-xr-x 3 root root 60 Feb 12 21:08 B > > /proc/self/fd/4/B: > total 0 > drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret > > /proc/self/fd/4/B/Secret: > total 0 I tracked this down to put_mnt_ns running passing UMOUNT_SYNC and disconnecting all of the mounts in a mount namespace. Fix this by factoring drop_mounts out of drop_collected_mounts and passing 0 instead of UMOUNT_SYNC. There are two possible behavior differences that result from this. - No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on the vfsmounts being unmounted. This effects the lazy rcu walk by kicking the walk out of rcu mode and forcing it to be a non-lazy walk. - No longer disconnecting locked mounts will keep some mounts around longer as they stay because the are locked to other mounts. There are only two users of drop_collected mounts: audit_tree.c and put_mnt_ns. In audit_tree.c the mounts are private and there are no rcu lazy walks only calls to iterate_mounts. So the changes should have no effect except for a small timing effect as the connected mounts are disconnected. In put_mnt_ns there may be references from process outside the mount namespace to the mounts. So the mounts remaining connected will be the bug fix that is needed. That rcu walks are allowed to continue appears not to be a problem especially as the rcu walk change was about an implementation detail not about semantics. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Timothy Baldwin Tested-by: Timothy Baldwin Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit d7565180923ee289a6b68456df46ad9c15eaff9d Author: Eric W. Biederman Date: Thu Oct 25 09:04:18 2018 -0500 mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts commit df7342b240185d58d3d9665c0bbf0a0f5570ec29 upstream. Jonathan Calmels from NVIDIA reported that he's able to bypass the mount visibility security check in place in the Linux kernel by using a combination of the unbindable property along with the private mount propagation option to allow a unprivileged user to see a path which was purposefully hidden by the root user. Reproducer: # Hide a path to all users using a tmpfs root@castiana:~# mount -t tmpfs tmpfs /sys/devices/ root@castiana:~# # As an unprivileged user, unshare user namespace and mount namespace stgraber@castiana:~$ unshare -U -m -r # Confirm the path is still not accessible root@castiana:~# ls /sys/devices/ # Make /sys recursively unbindable and private root@castiana:~# mount --make-runbindable /sys root@castiana:~# mount --make-private /sys # Recursively bind-mount the rest of /sys over to /mnnt root@castiana:~# mount --rbind /sys/ /mnt # Access our hidden /sys/device as an unprivileged user root@castiana:~# ls /mnt/devices/ breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual Solve this by teaching copy_tree to fail if a mount turns out to be both unbindable and locked. Cc: stable@vger.kernel.org Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Reported-by: Jonathan Calmels Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit fa14e9bddc7d0ca7a2b0073286ed86f343ae771d Author: Eric W. Biederman Date: Mon Oct 22 10:21:38 2018 -0500 mount: Retest MNT_LOCKED in do_umount commit 25d202ed820ee347edec0bf3bf553544556bf64b upstream. It was recently pointed out that the one instance of testing MNT_LOCKED outside of the namespace_sem is in ksys_umount. Fix that by adding a test inside of do_umount with namespace_sem and the mount_lock held. As it helps to fail fails the existing test is maintained with an additional comment pointing out that it may be racy because the locks are not held. Cc: stable@vger.kernel.org Reported-by: Al Viro Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Signed-off-by: "Eric W. Biederman" Signed-off-by: Greg Kroah-Hartman commit d450fcdb42e72939f42d8bb696546b31d6b89204 Author: Vasily Averin Date: Wed Nov 7 22:36:23 2018 -0500 ext4: fix buffer leak in __ext4_read_dirblock() on error path commit de59fae0043f07de5d25e02ca360f7d57bfa5866 upstream. Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.9 Signed-off-by: Greg Kroah-Hartman commit 82dfeb8d8254f213c8d86fe03458c94590560f90 Author: Vasily Averin Date: Wed Nov 7 11:10:21 2018 -0500 ext4: fix buffer leak in ext4_xattr_move_to_block() on error path commit 6bdc9977fcdedf47118d2caf7270a19f4b6d8a8f upstream. Fixes: 3f2571c1f91f ("ext4: factor out xattr moving") Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...") Reviewed-by: Jan Kara Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.23 Signed-off-by: Greg Kroah-Hartman commit b66102a457de28cc58033de0eaaff8f1153621a1 Author: Vasily Averin Date: Wed Nov 7 11:07:01 2018 -0500 ext4: release bs.bh before re-using in ext4_xattr_block_find() commit 45ae932d246f721e6584430017176cbcadfde610 upstream. bs.bh was taken in previous ext4_xattr_block_find() call, it should be released before re-using Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.26 Signed-off-by: Greg Kroah-Hartman commit 6b436fb67e1fe5f8a1839145e68001492467fca0 Author: Vasily Averin Date: Wed Nov 7 10:56:28 2018 -0500 ext4: fix possible leak of s_journal_flag_rwsem in error path commit af18e35bfd01e6d65a5e3ef84ffe8b252d1628c5 upstream. Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.7 Signed-off-by: Greg Kroah-Hartman commit 8547ff5d14c2317f3a8719d39be5df0162da062e Author: Theodore Ts'o Date: Wed Nov 7 10:32:53 2018 -0500 ext4: fix possible leak of sbi->s_group_desc_leak in error path commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream. Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock") Reported-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 4.18 Signed-off-by: Greg Kroah-Hartman commit 11a2eb02e6d1efa664b2b52e9da5e2b6c8f28eb4 Author: Theodore Ts'o Date: Tue Nov 6 17:18:17 2018 -0500 ext4: avoid possible double brelse() in add_new_gdb() on error path commit 4f32c38b4662312dd3c5f113d8bdd459887fb773 upstream. Fixes: b40971426a83 ("ext4: add error checking to calls to ...") Reported-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.38 Signed-off-by: Greg Kroah-Hartman commit 142e0172e303708548844462fe70e276d739bac3 Author: Vasily Averin Date: Tue Nov 6 16:16:01 2018 -0500 ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing commit f348e2241fb73515d65b5d77dd9c174128a7fbf2 upstream. Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit f30a52c9c5fa6a83cdc38666eac593f94b60712f Author: Vasily Averin Date: Tue Nov 6 17:01:36 2018 -0500 ext4: avoid buffer leak in ext4_orphan_add() after prior errors commit feaf264ce7f8d54582e2f66eb82dd9dd124c94f3 upstream. Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock") Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link") Cc: Dmitry Monakhov Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.34 Signed-off-by: Greg Kroah-Hartman commit 8a6a7dd7449dc0b3404ead3fb04ff57292d6f11f Author: Vasily Averin Date: Tue Nov 6 16:20:40 2018 -0500 ext4: fix possible inode leak in the retry loop of ext4_resize_fs() commit db6aee62406d9fbb53315fcddd81f1dc271d49fa upstream. Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit 05821678f265d5019bea438b3b760e1fef0b56cc Author: Vasily Averin Date: Sat Nov 3 16:13:17 2018 -0400 ext4: avoid potential extra brelse in setup_new_flex_group_blocks() commit 9e4028935cca3f9ef9b6a90df9da6f1f94853536 upstream. Currently bh is set to NULL only during first iteration of for cycle, then this pointer is not cleared after end of using. Therefore rollback after errors can lead to extra brelse(bh) call, decrements bh counter and later trigger an unexpected warning in __brelse() Patch moves brelse() calls in body of cycle to exclude requirement of brelse() call in rollback. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.3+ Signed-off-by: Greg Kroah-Hartman commit c484fd258ccaa4e45c72bed5eb06a3237495279c Author: Vasily Averin Date: Sat Nov 3 16:50:08 2018 -0400 ext4: add missing brelse() add_new_gdb_meta_bg()'s error path commit 61a9c11e5e7a0dab5381afa5d9d4dd5ebf18f7a0 upstream. Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 3.7 Signed-off-by: Greg Kroah-Hartman commit 771e8c73f26ff36ee4f281506743dde5a5507ab0 Author: Vasily Averin Date: Sat Nov 3 16:22:10 2018 -0400 ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path commit cea5794122125bf67559906a0762186cf417099c upstream. Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...") Cc: stable@kernel.org # 3.3 Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit b1341999bd39696cee67138e9e20ecacee59517c Author: Vasily Averin Date: Sat Nov 3 17:11:19 2018 -0400 ext4: add missing brelse() update_backups()'s error path commit ea0abbb648452cdb6e1734b702b6330a7448fcf8 upstream. Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3") Signed-off-by: Vasily Averin Signed-off-by: Theodore Ts'o Cc: stable@kernel.org # 2.6.19 Signed-off-by: Greg Kroah-Hartman commit f6939dbd8071a95d536c4c92a5a0d1426a9e8fd3 Author: Michael Kelley Date: Sun Nov 4 03:48:54 2018 +0000 clockevents/drivers/i8253: Add support for PIT shutdown quirk commit 35b69a420bfb56b7b74cb635ea903db05e357bec upstream. Add support for platforms where pit_shutdown() doesn't work because of a quirk in the PIT emulation. On these platforms setting the counter register to zero causes the PIT to start running again, negating the shutdown. Provide a global variable that controls whether the counter register is zero'ed, which platform specific code can override. Signed-off-by: Michael Kelley Signed-off-by: Thomas Gleixner Cc: "gregkh@linuxfoundation.org" Cc: "devel@linuxdriverproject.org" Cc: "daniel.lezcano@linaro.org" Cc: "virtualization@lists.linux-foundation.org" Cc: "jgross@suse.com" Cc: "akataria@vmware.com" Cc: "olaf@aepfle.de" Cc: "apw@canonical.com" Cc: vkuznets Cc: "jasowang@redhat.com" Cc: "marcelo.cerri@canonical.com" Cc: KY Srinivasan Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1541303219-11142-2-git-send-email-mikelley@microsoft.com Signed-off-by: Greg Kroah-Hartman commit 800d8112ee7f55688b54abcd84855525e389cca0 Author: Filipe Manana Date: Mon Nov 5 11:14:17 2018 +0000 Btrfs: fix data corruption due to cloning of eof block commit ac765f83f1397646c11092a032d4f62c3d478b81 upstream. We currently allow cloning a range from a file which includes the last block of the file even if the file's size is not aligned to the block size. This is fine and useful when the destination file has the same size, but when it does not and the range ends somewhere in the middle of the destination file, it leads to corruption because the bytes between the EOF and the end of the block have undefined data (when there is support for discard/trimming they have a value of 0x00). Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ export foo_size=$((256 * 1024 + 100)) $ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo $ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar $ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar $ od -A d -t x1 /mnt/bar 0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c * 0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00 0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 * 1048576 The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527 (512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead of 0xb5. This is similar to the problem we had for deduplication that got recently fixed by commit de02b9f6bb65 ("Btrfs: fix data corruption when deduplicating between different files"). Fix this by not allowing such operations to be performed and return the errno -EINVAL to user space. This is what XFS is doing as well at the VFS level. This change however now makes us return -EINVAL instead of -EOPNOTSUPP for cases where the source range maps to an inline extent and the destination range's end is smaller then the destination file's size, since the detection of inline extents is done during the actual process of dropping file extent items (at __btrfs_drop_extents()). Returning the -EINVAL error is done early on and solely based on the input parameters (offsets and length) and destination file's size. This makes us consistent with XFS and anyone else supporting cloning since this case is now checked at a higher level in the VFS and is where the -EINVAL will be returned from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1 by commit 07d19dc9fbe9 ("vfs: avoid problematic remapping requests into partial EOF block"). So this change is more geared towards stable kernels, as it's unlikely the new VFS checks get removed intentionally. A test case for fstests follows soon, as well as an update to filter existing tests that expect -EOPNOTSUPP to accept -EINVAL as well. CC: # 4.4+ Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 3fe6b9aa5cd26375176627fae5331a87a28927bd Author: Robbie Ko Date: Tue Oct 30 18:04:04 2018 +0800 Btrfs: fix cur_offset in the error case for nocow commit 506481b20e818db40b6198815904ecd2d6daee64 upstream. When the cow_file_range fails, the related resources are unlocked according to the range [start..end), so the unlock cannot be repeated in run_delalloc_nocow. In some cases (e.g. cur_offset <= end && cow_start != -1), cur_offset is not updated correctly, so move the cur_offset update before cow_file_range. kernel BUG at mm/page-writeback.c:2663! Internal error: Oops - BUG: 0 [#1] SMP CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O Hardware name: Realtek_RTD1296 (DT) Workqueue: writeback wb_workfn (flush-btrfs-1) task: ffffffc076db3380 ti: ffffffc02e9ac000 task.ti: ffffffc02e9ac000 PC is at clear_page_dirty_for_io+0x1bc/0x1e8 LR is at clear_page_dirty_for_io+0x14/0x1e8 pc : [] lr : [] pstate: 40000145 sp : ffffffc02e9af4f0 Process kworker/u8:7 (pid: 31525, stack limit = 0xffffffc02e9ac020) Call trace: [] clear_page_dirty_for_io+0x1bc/0x1e8 [] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs] [] run_delalloc_nocow+0x3b8/0x948 [btrfs] [] run_delalloc_range+0x250/0x3a8 [btrfs] [] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs] [] __extent_writepage+0xe8/0x248 [btrfs] [] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs] [] extent_writepages+0x48/0x68 [btrfs] [] btrfs_writepages+0x20/0x30 [btrfs] [] do_writepages+0x30/0x88 [] __writeback_single_inode+0x34/0x198 [] writeback_sb_inodes+0x184/0x3c0 [] __writeback_inodes_wb+0x6c/0xc0 [] wb_writeback+0x1b8/0x1c0 [] wb_workfn+0x150/0x250 [] process_one_work+0x1dc/0x388 [] worker_thread+0x130/0x500 [] kthread+0x10c/0x110 [] ret_from_fork+0x10/0x40 Code: d503201f a9025bb5 a90363b7 f90023b9 (d4210000) CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana Signed-off-by: Robbie Ko Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 53111ab29e0cf303b6f294f035da1458067287f1 Author: H. Peter Anvin (Intel) Date: Mon Oct 22 09:19:05 2018 -0700 arch/alpha, termios: implement BOTHER, IBSHIFT and termios2 commit d0ffb805b729322626639336986bc83fc2e60871 upstream. Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags using arbitrary flags. Because BOTHER is not defined, the general Linux code doesn't allow setting arbitrary baud rates, and because CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037. Resolve both problems by #defining BOTHER to 037 on Alpha. However, userspace still needs to know if setting BOTHER is actually safe given legacy kernels (does anyone actually care about that on Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even though they use the same structure. Define struct termios2 just for compatibility; it is the exact same structure as struct termios. In a future patchset, this will be cleaned up so the uapi headers are usable from libc. Signed-off-by: H. Peter Anvin (Intel) Cc: Jiri Slaby Cc: Al Viro Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Thomas Gleixner Cc: Kate Stewart Cc: Philippe Ombredanne Cc: Eugene Syromiatnikov Cc: Cc: Cc: Johan Hovold Cc: Alan Cox Cc: Signed-off-by: Greg Kroah-Hartman commit 4d18bea3de8efa7b6816344f18a71b43f472aa64 Author: H. Peter Anvin Date: Mon Oct 22 09:19:04 2018 -0700 termios, tty/tty_baudrate.c: fix buffer overrun commit 991a25194097006ec1e0d2e0814ff920e59e3465 upstream. On architectures with CBAUDEX == 0 (Alpha and PowerPC), the code in tty_baudrate.c does not do any limit checking on the tty_baudrate[] array, and in fact a buffer overrun is possible on both architectures. Add a limit check to prevent that situation. This will be followed by a much bigger cleanup/simplification patch. Signed-off-by: H. Peter Anvin (Intel) Requested-by: Cc: Johan Hovold Cc: Jiri Slaby Cc: Al Viro Cc: Richard Henderson Cc: Ivan Kokshaysky Cc: Matt Turner Cc: Thomas Gleixner Cc: Kate Stewart Cc: Philippe Ombredanne Cc: Eugene Syromiatnikov Cc: Alan Cox Cc: stable Signed-off-by: Greg Kroah-Hartman commit af86cb901c7f48fb3b4b66091437d4d5af4d5189 Author: John Garry Date: Thu Nov 8 18:17:03 2018 +0800 of, numa: Validate some distance map rules commit 89c38422e072bb453e3045b8f1b962a344c3edea upstream. Currently the NUMA distance map parsing does not validate the distance table for the distance-matrix rules 1-2 in [1]. However the arch NUMA code may enforce some of these rules, but not all. Such is the case for the arm64 port, which does not enforce the rule that the distance between separates nodes cannot equal LOCAL_DISTANCE. The patch adds the following rules validation: - distance of node to self equals LOCAL_DISTANCE - distance of separate nodes > LOCAL_DISTANCE This change avoids a yet-unresolved crash reported in [2]. A note on dealing with symmetrical distances between nodes: Validating symmetrical distances between nodes is difficult. If it were mandated in the bindings that every distance must be recorded in the table, then it would be easy. However, it isn't. In addition to this, it is also possible to record [b, a] distance only (and not [a, b]). So, when processing the table for [b, a], we cannot assert that current distance of [a, b] != [b, a] as invalid, as [a, b] distance may not be present in the table and current distance would be default at REMOTE_DISTANCE. As such, we maintain the policy that we overwrite distance [a, b] = [b, a] for b > a. This policy is different to kernel ACPI SLIT validation, which allows non-symmetrical distances (ACPI spec SLIT rules allow it). However, the distance debug message is dropped as it may be misleading (for a distance which is later overwritten). Some final notes on semantics: - It is implied that it is the responsibility of the arch NUMA code to reset the NUMA distance map for an error in distance map parsing. - It is the responsibility of the FW NUMA topology parsing (whether OF or ACPI) to enforce NUMA distance rules, and not arch NUMA code. [1] Documents/devicetree/bindings/numa.txt [2] https://www.spinics.net/lists/arm-kernel/msg683304.html Cc: stable@vger.kernel.org # 4.7 Signed-off-by: John Garry Acked-by: Will Deacon Signed-off-by: Rob Herring Signed-off-by: Greg Kroah-Hartman commit d29a7b6f720dff56c6eaa676b91cedf25e6a6343 Author: Arnd Bergmann Date: Thu Oct 11 13:06:16 2018 +0200 mtd: docg3: don't set conflicting BCH_CONST_PARAMS option commit be2e1c9dcf76886a83fb1c433a316e26d4ca2550 upstream. I noticed during the creation of another bugfix that the BCH_CONST_PARAMS option that is set by DOCG3 breaks setting variable parameters for any other users of the BCH library code. The only other user we have today is the MTD_NAND software BCH implementation (most flash controllers use hardware BCH these days and are not affected). I considered removing BCH_CONST_PARAMS entirely because of the inherent conflict, but according to the description in lib/bch.c there is a significant performance benefit in keeping it. To avoid the immediate problem of the conflict between MTD_NAND_BCH and DOCG3, this only sets the constant parameters if MTD_NAND_BCH is disabled, which should fix the problem for all cases that are affected. This should also work for all stable kernels. Note that there is only one machine that actually seems to use the DOCG3 driver (arch/arm/mach-pxa/mioa701.c), so most users should have the driver disabled, but it almost certainly shows up if we wanted to test random kernels on machines that use software BCH in MTD. Fixes: d13d19ece39f ("mtd: docg3: add ECC correction code") Cc: stable@vger.kernel.org Cc: Robert Jarzmik Signed-off-by: Arnd Bergmann Signed-off-by: Boris Brezillon Signed-off-by: Greg Kroah-Hartman commit 7e1e1956dcc7c391ca48a4c278b56134ea47b7e9 Author: Vasily Khoruzhick Date: Thu Oct 25 12:15:43 2018 -0700 netfilter: conntrack: fix calculation of next bucket number in early_drop commit f393808dc64149ccd0e5a8427505ba2974a59854 upstream. If there's no entry to drop in bucket that corresponds to the hash, early_drop() should look for it in other buckets. But since it increments hash instead of bucket number, it actually looks in the same bucket 8 times: hsize is 16k by default (14 bits) and hash is 32-bit value, so reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in most cases. Fix it by increasing bucket number instead of hash and rename _hash to bucket to avoid future confusion. Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic") Cc: # v4.7+ Signed-off-by: Vasily Khoruzhick Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 818e584636d7ddaa1dec897de2506a73c33ebb38 Author: Andrea Arcangeli Date: Fri Nov 2 15:47:59 2018 -0700 mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings commit ac5b2c18911ffe95c08d69273917f90212cf5659 upstream. THP allocation might be really disruptive when allocated on NUMA system with the local node full or hard to reclaim. Stefan has posted an allocation stall report on 4.12 based SLES kernel which suggests the same issue: kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null) kvm cpuset=/ mems_allowed=0-1 CPU: 10 PID: 84752 Comm: kvm Tainted: G W 4.12.0+98-ph 0000001 SLE15 (unreleased) Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017 Call Trace: dump_stack+0x5c/0x84 warn_alloc+0xe0/0x180 __alloc_pages_slowpath+0x820/0xc90 __alloc_pages_nodemask+0x1cc/0x210 alloc_pages_vma+0x1e5/0x280 do_huge_pmd_wp_page+0x83f/0xf00 __handle_mm_fault+0x93d/0x1060 handle_mm_fault+0xc6/0x1b0 __do_page_fault+0x230/0x430 do_page_fault+0x2a/0x70 page_fault+0x7b/0x80 [...] Mem-Info: active_anon:126315487 inactive_anon:1612476 isolated_anon:5 active_file:60183 inactive_file:245285 isolated_file:0 unevictable:15657 dirty:286 writeback:1 unstable:0 slab_reclaimable:75543 slab_unreclaimable:2509111 mapped:81814 shmem:31764 pagetables:370616 bounce:0 free:32294031 free_pcp:6233 free_cma:0 Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no The defrag mode is "madvise" and from the above report it is clear that the THP has been allocated for MADV_HUGEPAGA vma. Andrea has identified that the main source of the problem is __GFP_THISNODE usage: : The problem is that direct compaction combined with the NUMA : __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very : hard the local node, instead of failing the allocation if there's no : THP available in the local node. : : Such logic was ok until __GFP_THISNODE was added to the THP allocation : path even with MPOL_DEFAULT. : : The idea behind the __GFP_THISNODE addition, is that it is better to : provide local memory in PAGE_SIZE units than to use remote NUMA THP : backed memory. That largely depends on the remote latency though, on : threadrippers for example the overhead is relatively low in my : experience. : : The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in : extremely slow qemu startup with vfio, if the VM is larger than the : size of one host NUMA node. This is because it will try very hard to : unsuccessfully swapout get_user_pages pinned pages as result of the : __GFP_THISNODE being set, instead of falling back to PAGE_SIZE : allocations and instead of trying to allocate THP on other nodes (it : would be even worse without vfio type1 GUP pins of course, except it'd : be swapping heavily instead). Fix this by removing __GFP_THISNODE for THP requests which are requesting the direct reclaim. This effectivelly reverts 5265047ac301 on the grounds that the zone/node reclaim was known to be disruptive due to premature reclaim when there was memory free. While it made sense at the time for HPC workloads without NUMA awareness on rare machines, it was ultimately harmful in the majority of cases. The existing behaviour is similar, if not as widespare as it applies to a corner case but crucially, it cannot be tuned around like zone_reclaim_mode can. The default behaviour should always be to cause the least harm for the common case. If there are specialised use cases out there that want zone_reclaim_mode in specific cases, then it can be built on top. Longterm we should consider a memory policy which allows for the node reclaim like behavior for the specific memory ranges which would allow a [1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@redhat.com Mel said: : Both patches look correct to me but I'm responding to this one because : it's the fix. The change makes sense and moves further away from the : severe stalling behaviour we used to see with both THP and zone reclaim : mode. : : I put together a basic experiment with usemem configured to reference a : buffer multiple times that is 80% the size of main memory on a 2-socket : box with symmetric node sizes and defrag set to "always". The defrag : setting is not the default but it would be functionally similar to : accessing a buffer with madvise(MADV_HUGEPAGE). Usemem is configured to : reference the buffer multiple times and while it's not an interesting : workload, it would be expected to complete reasonably quickly as it fits : within memory. The results were; : : usemem : vanilla noreclaim-v1 : Amean Elapsd-1 42.78 ( 0.00%) 26.87 ( 37.18%) : Amean Elapsd-3 27.55 ( 0.00%) 7.44 ( 73.00%) : Amean Elapsd-4 5.72 ( 0.00%) 5.69 ( 0.45%) : : This shows the elapsed time in seconds for 1 thread, 3 threads and 4 : threads referencing buffers 80% the size of memory. With the patches : applied, it's 37.18% faster for the single thread and 73% faster with two : threads. Note that 4 threads showing little difference does not indicate : the problem is related to thread counts. It's simply the case that 4 : threads gets spread so their workload mostly fits in one node. : : The overall view from /proc/vmstats is more startling : : 4.19.0-rc1 4.19.0-rc1 : vanillanoreclaim-v1r1 : Minor Faults 35593425 708164 : Major Faults 484088 36 : Swap Ins 3772837 0 : Swap Outs 3932295 0 : : Massive amounts of swap in/out without the patch : : Direct pages scanned 6013214 0 : Kswapd pages scanned 0 0 : Kswapd pages reclaimed 0 0 : Direct pages reclaimed 4033009 0 : : Lots of reclaim activity without the patch : : Kswapd efficiency 100% 100% : Kswapd velocity 0.000 0.000 : Direct efficiency 67% 100% : Direct velocity 11191.956 0.000 : : Mostly from direct reclaim context as you'd expect without the patch. : : Page writes by reclaim 3932314.000 0.000 : Page writes file 19 0 : Page writes anon 3932295 0 : Page reclaim immediate 42336 0 : : Writes from reclaim context is never good but the patch eliminates it. : : We should never have default behaviour to thrash the system for such a : basic workload. If zone reclaim mode behaviour is ever desired but on a : single task instead of a global basis then the sensible option is to build : a mempolicy that enforces that behaviour. This was a severe regression compared to previous kernels that made important workloads unusable and it starts when __GFP_THISNODE was added to THP allocations under MADV_HUGEPAGE. It is not a significant risk to go to the previous behavior before __GFP_THISNODE was added, it worked like that for years. This was simply an optimization to some lucky workloads that can fit in a single node, but it ended up breaking the VM for others that can't possibly fit in a single node, so going back is safe. [mhocko@suse.com: rewrote the changelog based on the one from Andrea] Link: http://lkml.kernel.org/r/20180925120326.24392-2-mhocko@kernel.org Fixes: 5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node") Signed-off-by: Andrea Arcangeli Signed-off-by: Michal Hocko Reported-by: Stefan Priebe Debugged-by: Andrea Arcangeli Reported-by: Alex Williamson Reviewed-by: Mel Gorman Tested-by: Mel Gorman Cc: Zi Yan Cc: Vlastimil Babka Cc: David Rientjes Cc: "Kirill A. Shutemov" Cc: [4.1+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit dd4c84ba2b3a20ac1d05d1b1c579e5cb6b12df13 Author: Changwei Ge Date: Fri Nov 2 15:48:15 2018 -0700 ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry commit 29aa30167a0a2e6045a0d6d2e89d8168132333d5 upstream. Somehow, file system metadata was corrupted, which causes ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el(). According to the original design intention, if above happens we should skip the problematic block and continue to retrieve dir entry. But there is obviouse misuse of brelse around related code. After failure of ocfs2_check_dir_entry(), current code just moves to next position and uses the problematic buffer head again and again during which the problematic buffer head is released for multiple times. I suppose, this a serious issue which is long-lived in ocfs2. This may cause other file systems which is also used in a the same host insane. So we should also consider about bakcporting this patch into linux -stable. Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com Signed-off-by: Changwei Ge Suggested-by: Changkuo Shi Reviewed-by: Andrew Morton Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Joseph Qi Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit acdc6a723ea5d249093d927e2e958ae5c26baa11 Author: Greg Edwards Date: Wed Aug 22 13:21:53 2018 -0600 vhost/scsi: truncate T10 PI iov_iter to prot_bytes commit 4542d623c7134bc1738f8a68ccb6dd546f1c264f upstream. Commands with protection information included were not truncating the protection iov_iter to the number of protection bytes in the command. This resulted in vhost_scsi mis-calculating the size of the protection SGL in vhost_scsi_calc_sgls(), and including both the protection and data SG entries in the protection SGL. Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq") Signed-off-by: Greg Edwards Signed-off-by: Michael S. Tsirkin Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a Cc: stable@vger.kernel.org Reviewed-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit e200279bb82795e6da3a23ab36550e54a3602bea Author: Gustavo A. R. Silva Date: Wed Jul 25 19:47:19 2018 -0500 reset: hisilicon: fix potential NULL pointer dereference commit e9a2310fb689151166df7fd9971093362d34bd79 upstream. There is a potential execution path in which function platform_get_resource() returns NULL. If this happens, we will end up having a NULL pointer dereference. Fix this by replacing devm_ioremap with devm_ioremap_resource, which has the NULL check and the memory region request. This code was detected with the help of Coccinelle. Cc: stable@vger.kernel.org Fixes: 97b7129cd2af ("reset: hisilicon: change the definition of hisi_reset_init") Signed-off-by: Gustavo A. R. Silva Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 7ecf24877ea8c25625c904ac30f051c6a4bc7878 Author: Mikulas Patocka Date: Mon Oct 8 12:57:35 2018 +0200 mach64: fix image corruption due to reading accelerator registers commit c09bcc91bb94ed91f1391bffcbe294963d605732 upstream. Reading the registers without waiting for engine idle returns unpredictable values. These unpredictable values result in display corruption - if atyfb_imageblit reads the content of DP_PIX_WIDTH with the bit DP_HOST_TRIPLE_EN set (from previous invocation), the driver would never ever clear the bit, resulting in display corruption. We don't want to wait for idle because it would degrade performance, so this patch modifies the driver so that it never reads accelerator registers. HOST_CNTL doesn't have to be read, we can just write it with HOST_BYTE_ALIGN because no other part of the driver cares if HOST_BYTE_ALIGN is set. DP_PIX_WIDTH is written in the functions atyfb_copyarea and atyfb_fillrect with the default value and in atyfb_imageblit with the value set according to the source image data. Signed-off-by: Mikulas Patocka Reviewed-by: Ville Syrjälä Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Greg Kroah-Hartman commit 6bd9b55243e6caafec1d245c475d6dc4d842e6c4 Author: Mikulas Patocka Date: Mon Oct 8 12:57:34 2018 +0200 mach64: fix display corruption on big endian machines commit 3c6c6a7878d00a3ac997a779c5b9861ff25dfcc8 upstream. The code for manual bit triple is not endian-clean. It builds the variable "hostdword" using byte accesses, therefore we must read the variable with "le32_to_cpu". The patch also enables (hardware or software) bit triple only if the image is monochrome (image->depth). If we want to blit full-color image, we shouldn't use the triple code. Signed-off-by: Mikulas Patocka Reviewed-by: Ville Syrjälä Cc: stable@vger.kernel.org Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Greg Kroah-Hartman commit 43e38372f9b1d7a79c6f9e6c18fd5d5e3ca07693 Author: Yan, Zheng Date: Thu Sep 27 21:16:05 2018 +0800 Revert "ceph: fix dentry leak in splice_dentry()" commit efe328230dc01aa0b1269aad0b5fae73eea4677a upstream. This reverts commit 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3. splice_dentry() is used by three places. For two places, req->r_dentry is passed to splice_dentry(). In the case of error, req->r_dentry does not get updated. So splice_dentry() should not drop reference. Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: "Yan, Zheng" Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman commit 9efe04470a973c01cea272f795117cf28f3383ec Author: Ilya Dryomov Date: Wed Sep 26 18:03:16 2018 +0200 libceph: bump CEPH_MSG_MAX_DATA_LEN commit 94e6992bb560be8bffb47f287194adf070b57695 upstream. If the read is large enough, we end up spinning in the messenger: libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error This is a receive side limit, so only reads were affected. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov Signed-off-by: Greg Kroah-Hartman commit ee27421c80df335ffef85d9f45c37890a3a8ccb4 Author: Enric Balletbo i Serra Date: Tue Oct 16 15:41:44 2018 +0200 clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call commit 665636b2940d0897c4130253467f5e8c42eea392 upstream. Fixes the signedness bug returning '(-22)' on the return type by removing the sanity checker in rockchip_ddrclk_get_parent(). The function should return and unsigned value only and it's safe to remove the sanity checker as the core functions that call get_parent like clk_core_get_parent_by_index already ensures the validity of the clk index returned (index >= core->num_parents). Fixes: a4f182bf81f18 ("clk: rockchip: add new clock-type for the ddrclk") Cc: stable@vger.kernel.org Signed-off-by: Enric Balletbo i Serra Reviewed-by: Stephen Boyd Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 79a3e118b016bf6a64a5e52ca32ecc2e2c1def0f Author: Ronald Wahl Date: Wed Oct 10 15:54:54 2018 +0200 clk: at91: Fix division by zero in PLL recalc_rate() commit 0f5cb0e6225cae2f029944cb8c74617aab6ddd49 upstream. Commit a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached MUL and DIV values") removed a check that prevents a division by zero. This now causes a stacktrace when booting the kernel on a at91 platform if the PLL DIV register contains zero. This commit reintroduces this check. Fixes: a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached...") Cc: Signed-off-by: Ronald Wahl Acked-by: Ludovic Desroches Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 0a468686faef8c9cb2d9eb9d0ab1d84ece7c4d06 Author: Krzysztof Kozlowski Date: Wed Aug 29 21:20:10 2018 +0200 clk: s2mps11: Fix matching when built as module and DT node contains compatible commit 8985167ecf57f97061599a155bb9652c84ea4913 upstream. When driver is built as module and DT node contains clocks compatible (e.g. "samsung,s2mps11-clk"), the module will not be autoloaded because module aliases won't match. The modalias from uevent: of:NclocksTCsamsung,s2mps11-clk The modalias from driver: platform:s2mps11-clk The devices are instantiated by parent's MFD. However both Device Tree bindings and parent define the compatible for clocks devices. In case of module matching this DT compatible will be used. The issue will not happen if this is a built-in (no need for module matching) or when clocks DT node does not contain compatible (not correct from bindings perspective but working for driver). Note when backporting to stable kernels: adjust the list of device ID entries. Cc: Fixes: 53c31b3437a6 ("mfd: sec-core: Add of_compatible strings for clock MFD cells") Signed-off-by: Krzysztof Kozlowski Acked-by: Stephen Boyd Signed-off-by: Stephen Boyd Signed-off-by: Greg Kroah-Hartman commit 0dac028156c654724415971a86812fd2574f1601 Author: Max Filippov Date: Tue Nov 13 23:46:42 2018 -0800 xtensa: fix boot parameters address translation commit 40dc948f234b73497c3278875eb08a01d5854d3f upstream. The bootloader may pass physical address of the boot parameters structure to the MMUv3 kernel in the register a2. Code in the _SetupMMU block in the arch/xtensa/kernel/head.S is supposed to map that physical address to the virtual address in the configured virtual memory layout. This code haven't been updated when additional 256+256 and 512+512 memory layouts were introduced and it may produce wrong addresses when used with these layouts. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit a10aa331aa02e0140e81b31a97b258dae764efcc Author: Max Filippov Date: Sun Nov 4 01:46:00 2018 -0700 xtensa: make sure bFLT stack is 16 byte aligned commit 0773495b1f5f1c5e23551843f87b5ff37e7af8f7 upstream. Xtensa ABI requires stack alignment to be at least 16. In noMMU configuration ARCH_SLAB_MINALIGN is used to align stack. Make it at least 16. This fixes the following runtime error in noMMU configuration, caused by interaction between insufficiently aligned stack and alloca function, that results in corruption of on-stack variable in the libc function glob: Caught unhandled exception in 'sh' (pid = 47, pc = 0x02d05d65) - should not happen EXCCAUSE is 15 Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit c9226141c2e59fa705f48dc1fa15e36558bae632 Author: Max Filippov Date: Mon Oct 29 18:30:13 2018 -0700 xtensa: add NOTES section to the linker script commit 4119ba211bc4f1bf638f41e50b7a0f329f58aa16 upstream. This section collects all source .note.* sections together in the vmlinux image. Without it .note.Linux section may be placed at address 0, while the rest of the kernel is at its normal address, resulting in a huge vmlinux.bin image that may not be linked into the xtensa Image.elf. Cc: stable@vger.kernel.org Signed-off-by: Max Filippov Signed-off-by: Greg Kroah-Hartman commit fca93d34f3f9bcdf8662d83ca8b8c4f4cf459a7a Author: Huacai Chen Date: Wed Sep 5 17:33:09 2018 +0800 MIPS: Loongson-3: Fix BRIDGE irq delivery problem [ Upstream commit 360fe725f8849aaddc53475fef5d4a0c439b05ae ] After commit e509bd7da149dc349160 ("genirq: Allow migration of chained interrupts by installing default action") Loongson-3 fails at here: setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction); This is because both chained_action and cascade_irqaction don't have IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET timer interrupt can't be delivered during S3. So we set the irqchip of the chained irq to loongson_irq_chip which doesn't disable the chained irq in CP0.Status. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20434/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin commit 83babf64c1c12279386a2d8eee1582eca74a6c10 Author: Huacai Chen Date: Wed Sep 5 17:33:08 2018 +0800 MIPS: Loongson-3: Fix CPU UART irq delivery problem [ Upstream commit d06f8a2f1befb5a3d0aa660ab1c05e9b744456ea ] Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to other CPUs) may cause interrupts be lost, especially in multi-package machines (Package-0's UART irq cannot be delivered to others). So make mask_loongson_irq() and unmask_loongson_irq() be no-ops. The original problem (UART IRQ may deliver to any core) is also because of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to remove all of the stuff. Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20433/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin commit b7706a824257acbe6bfbdf602765fb60a5963d68 Author: Helge Deller Date: Sun Oct 14 21:58:00 2018 +0200 parisc: Fix exported address of os_hpmc handler [ Upstream commit 99a3ae51d557d8e38a7aece65678a31f9db215ee ] In the C-code we need to put the physical address of the hpmc handler in the interrupt vector table (IVA) in order to get HPMCs working. Since on parisc64 function pointers are indirect (in fact they are function descriptors) we instead export the address as variable and not as function. This reverts a small part of commit f39cce654f9a ("parisc: Add cfi_startproc and cfi_endproc to assembly code"). Signed-off-by: Helge Deller Cc: [4.9+] Signed-off-by: Sasha Levin commit 5a41c652134b6c55442b67dc5acc4b0eee56883f Author: Helge Deller Date: Sat Mar 24 21:18:25 2018 +0100 parisc: Fix HPMC handler by increasing size to multiple of 16 bytes [ Upstream commit d5654e156bc4d68a87bbaa6d7e020baceddf6e68 ] Make sure that the HPMC (High Priority Machine Check) handler is 16-byte aligned and that it's length in the IVT is a multiple of 16 bytes. Otherwise PDC may decide not to call the HPMC crash handler. Signed-off-by: Helge Deller Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 4d01ed83361cd01012b35413aac2711f18932fed Author: Helge Deller Date: Tue Dec 12 21:25:41 2017 +0100 parisc: Align os_hpmc_size on word boundary [ Upstream commit 0ed9d3de5f8f97e6efd5ca0e3377cab5f0451ead ] The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus triggered the unaligned fault handler at startup. Fix it by aligning it properly. Signed-off-by: Helge Deller Cc: # v4.14+ Signed-off-by: Sasha Levin commit 3d67d62931d9471b860ba7a732d5961da1320a7e Author: Kees Cook Date: Fri May 5 15:30:23 2017 -0700 bna: ethtool: Avoid reading past end of buffer [ Upstream commit 4dc69c1c1fff2f587f8e737e70b4a4e7565a5c94 ] Using memcpy() from a string that is shorter than the length copied means the destination buffer is being filled with arbitrary data from the kernel rodata segment. Instead, use strncpy() which will fill the trailing bytes with zeros. This was found with the future CONFIG_FORTIFY_SOURCE feature. Cc: Daniel Micay Signed-off-by: Kees Cook Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit e029b8379ed64f5beaddbf53c87873246df300ff Author: Vincenzo Maffione Date: Sat Sep 16 18:00:00 2017 +0200 e1000: fix race condition between e1000_down() and e1000_watchdog [ Upstream commit 44c445c3d1b4eacff23141fa7977c3b2ec3a45c9 ] This patch fixes a race condition that can result into the interface being up and carrier on, but with transmits disabled in the hardware. The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which allows e1000_watchdog() interleave with e1000_down(). CPU x CPU y -------------------------------------------------------------------- e1000_down(): netif_carrier_off() e1000_watchdog(): if (carrier == off) { netif_carrier_on(); enable_hw_transmit(); } disable_hw_transmit(); e1000_watchdog(): /* carrier on, do nothing */ Signed-off-by: Vincenzo Maffione Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 55edbf1ce1bf834a9df977a721ad4a9250d15b1f Author: Colin Ian King Date: Fri Sep 22 18:13:48 2017 +0100 e1000: avoid null pointer dereference on invalid stat type [ Upstream commit 5983587c8c5ef00d6886477544ad67d495bc5479 ] Currently if the stat type is invalid then data[i] is being set either by dereferencing a null pointer p, or it is reading from an incorrect previous location if we had a valid stat type previously. Fix this by skipping over the read of p on an invalid stat type. Detected by CoverityScan, CID#113385 ("Explicit null dereferenced") Signed-off-by: Colin Ian King Reviewed-by: Alexander Duyck Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 1a55a71ed205ccd1546fff8b433652c4d5933550 Author: Michal Hocko Date: Tue Nov 13 16:41:56 2018 +0000 mm: do not bug_on on incorrect length in __mm_populate() commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream. syzbot has noticed that a specially crafted library can easily hit VM_BUG_ON in __mm_populate kernel BUG at mm/gup.c:1242! invalid opcode: 0000 [#1] SMP CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 RIP: 0010:__mm_populate+0x1e2/0x1f0 Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb Call Trace: vm_brk_flags+0xc3/0x100 vm_brk+0x1f/0x30 load_elf_library+0x281/0x2e0 __ia32_sys_uselib+0x170/0x1e0 do_fast_syscall_32+0xca/0x420 entry_SYSENTER_compat+0x70/0x7f The reason is that the length of the new brk is not page aligned when we try to populate the it. There is no reason to bug on that though. do_brk_flags already aligns the length properly so the mapping is expanded as it should. All we need is to tell mm_populate about it. Besides that there is absolutely no reason to to bug_on in the first place. The worst thing that could happen is that the last page wouldn't get populated and that is far from putting system into an inconsistent state. Fix the issue by moving the length sanitization code from do_brk_flags up to vm_brk_flags. The only other caller of do_brk_flags is brk syscall entry and it makes sure to provide the proper length so t here is no need for sanitation and so we can use do_brk_flags without it. Also remove the bogus BUG_ONs. [osalvador@techadventures.net: fix up vm_brk_flags s@request@len@] Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz Signed-off-by: Michal Hocko Reported-by: syzbot Tested-by: Tetsuo Handa Reviewed-by: Oscar Salvador Cc: Zi Yan Cc: "Aneesh Kumar K.V" Cc: Dan Williams Cc: "Kirill A. Shutemov" Cc: Michael S. Tsirkin Cc: Al Viro Cc: "Huang, Ying" Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [bwh: Backported to 4.9: - There is no do_brk_flags() function; update do_brk() - Adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Sasha Levin commit c31ccf1fe8c707ff2c078370c7c87ed794d0a39a Author: Miklos Szeredi Date: Fri Sep 28 16:43:22 2018 +0200 fuse: set FR_SENT while locked commit 4c316f2f3ff315cb48efb7435621e5bfb81df96d upstream. Otherwise fuse_dev_do_write() could come in and finish off the request, and the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...)) in request_end(). Signed-off-by: Miklos Szeredi Reported-by: syzbot+ef054c4d3f64cd7f7cec@syzkaller.appspotmai Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit 7238fcbd68147fd76a927bfd10cb377d032374bf Author: Miklos Szeredi Date: Fri Sep 28 16:43:22 2018 +0200 fuse: fix blocked_waitq wakeup commit 908a572b80f6e9577b45e81b3dfe2e22111286b8 upstream. Using waitqueue_active() is racy. Make sure we issue a wake_up() unconditionally after storing into fc->blocked. After that it's okay to optimize with waitqueue_active() since the first wake up provides the necessary barrier for all waiters, not the just the woken one. Signed-off-by: Miklos Szeredi Fixes: 3c18ef8117f0 ("fuse: optimize wake_up") Cc: # v3.10 Signed-off-by: Greg Kroah-Hartman commit 0799a93d30588e85eb15775bea286997e796e174 Author: Kirill Tkhai Date: Tue Sep 25 12:52:42 2018 +0300 fuse: Fix use-after-free in fuse_dev_do_write() commit d2d2d4fb1f54eff0f3faa9762d84f6446a4bc5d0 upstream. After we found req in request_find() and released the lock, everything may happen with the req in parallel: cpu0 cpu1 fuse_dev_do_write() fuse_dev_do_write() req = request_find(fpq, ...) ... spin_unlock(&fpq->lock) ... ... req = request_find(fpq, oh.unique) ... spin_unlock(&fpq->lock) queue_interrupt(&fc->iq, req); ... ... ... ... ... request_end(fc, req); fuse_put_request(fc, req); ... queue_interrupt(&fc->iq, req); Signed-off-by: Kirill Tkhai Signed-off-by: Miklos Szeredi Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit 7996b1c0eaefc5745c982301ca483f99555842e4 Author: Kirill Tkhai Date: Tue Sep 25 12:28:55 2018 +0300 fuse: Fix use-after-free in fuse_dev_do_read() commit bc78abbd55dd28e2287ec6d6502b842321a17c87 upstream. We may pick freed req in this way: [cpu0] [cpu1] fuse_dev_do_read() fuse_dev_do_write() list_move_tail(&req->list, ...); ... spin_unlock(&fpq->lock); ... ... request_end(fc, req); ... fuse_put_request(fc, req); if (test_bit(FR_INTERRUPTED, ...)) queue_interrupt(fiq, req); Fix that by keeping req alive until we finish all manipulations. Reported-by: syzbot+4e975615ca01f2277bdd@syzkaller.appspotmail.com Signed-off-by: Kirill Tkhai Signed-off-by: Miklos Szeredi Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts") Cc: # v4.2 Signed-off-by: Greg Kroah-Hartman commit 0dd45a4df3eb901668cafeefb667bf781a527eb3 Author: Quinn Tran Date: Tue Sep 11 10:18:21 2018 -0700 scsi: qla2xxx: shutdown chip if reset fail commit 1e4ac5d6fe0a4af17e4b6251b884485832bf75a3 upstream. If chip unable to fully initialize, use full shutdown sequence to clear out any stale FW state. Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Cc: stable@vger.kernel.org #4.10 Signed-off-by: Quinn Tran Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 0b6393523e59d1abcee494daaa350ef1ed6ed0c0 Author: Himanshu Madhani Date: Fri Aug 31 11:24:27 2018 -0700 scsi: qla2xxx: Fix incorrect port speed being set for FC adapters commit 4c1458df9635c7e3ced155f594d2e7dfd7254e21 upstream. Fixes: 6246b8a1d26c7c ("[SCSI] qla2xxx: Enhancements to support ISP83xx.") Fixes: 1bb395485160d2 ("qla2xxx: Correct iiDMA-update calling conventions.") Cc: Signed-off-by: Himanshu Madhani Signed-off-by: Martin K. Petersen Signed-off-by: Greg Kroah-Hartman commit 8dd745a8799ee01fc67b64fd33cdb44d04eb7e4c Author: Young_X Date: Wed Oct 3 12:54:29 2018 +0000 cdrom: fix improper type cast, which can leat to information leak. commit e4f3aa2e1e67bb48dfbaaf1cad59013d5a5bc276 upstream. There is another cast from unsigned long to int which causes a bounds check to fail with specially crafted input. The value is then used as an index in the slot array in cdrom_slot_status(). This issue is similar to CVE-2018-16658 and CVE-2018-10940. Signed-off-by: Young_X Signed-off-by: Jens Axboe Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 0cf4fa79920cfdff86e2ebef7aadd5a6e3b09b92 Author: Dominique Martinet Date: Tue Aug 28 07:32:35 2018 +0900 9p: clear dangling pointers in p9stat_free [ Upstream commit 62e3941776fea8678bb8120607039410b1b61a65 ] p9stat_free is more of a cleanup function than a 'free' function as it only frees the content of the struct; there are chances of use-after-free if it is improperly used (e.g. p9stat_free called twice as it used to be possible to) Clearing dangling pointers makes the function idempotent and safer to use. Link: http://lkml.kernel.org/r/1535410108-20650-2-git-send-email-asmadeus@codewreck.org Signed-off-by: Dominique Martinet Reported-by: syzbot+d4252148d198410b864f@syzkaller.appspotmail.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9e79094da01fecd4a1fb533e6f495c2a0e8bea75 Author: Dominique Martinet Date: Sat Sep 8 01:18:43 2018 +0900 9p locks: fix glock.client_id leak in do_lock [ Upstream commit b4dc44b3cac9e8327e0655f530ed0c46f2e6214c ] the 9p client code overwrites our glock.client_id pointing to a static buffer by an allocated string holding the network provided value which we do not care about; free and reset the value as appropriate. This is almost identical to the leak in v9fs_file_getlock() fixed by Al Viro in commit ce85dd58ad5a6 ("9p: we are leaking glock.client_id in v9fs_file_getlock()"), which was returned as an error by a coverity false positive -- while we are here attempt to make the code slightly more robust to future change of the net/9p/client code and hopefully more clear to coverity that there is no problem. Link: http://lkml.kernel.org/r/1536339057-21974-5-git-send-email-asmadeus@codewreck.org Signed-off-by: Dominique Martinet Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit abefbf42e4655a4d237748756c35711c857a7a39 Author: Breno Leitao Date: Tue Jul 31 17:55:57 2018 -0300 powerpc/selftests: Wait all threads to join [ Upstream commit 693b31b2fc1636f0aa7af53136d3b49f6ad9ff39 ] Test tm-tmspr might exit before all threads stop executing, because it just waits for the very last thread to join before proceeding/exiting. This patch makes sure that all threads that were created will join before proceeding/exiting. This patch also guarantees that the amount of threads being created is equal to thread_num. Signed-off-by: Breno Leitao Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit c19136a6cd2ef80e12c1bac21be752c34b71c632 Author: Marco Felsch Date: Thu Jun 28 12:20:33 2018 -0400 media: tvp5150: fix width alignment during set_selection() [ Upstream commit bd24db04101f45a9c1d874fe21b0c7eab7bcadec ] The driver ignored the width alignment which exists due to the UYVY colorspace format. Fix the width alignment and make use of the the provided v4l2 helper function to set the width, height and all alignments in one. Fixes: 963ddc63e20d ("[media] media: tvp5150: Add cropping support") Signed-off-by: Marco Felsch Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 64a5369b7e7fdbf5d6dd861e38a8ac9537f4ef89 Author: Phil Elwell Date: Wed Sep 12 15:31:55 2018 +0100 sc16is7xx: Fix for multi-channel stall [ Upstream commit 8344498721059754e09d30fe255a12dab8fb03ef ] The SC16IS752 is a dual-channel device. The two channels are largely independent, but the IRQ signals are wired together as an open-drain, active low signal which will be driven low while either of the channels requires attention, which can be for significant periods of time until operations complete and the interrupt can be acknowledged. In that respect it is should be treated as a true level-sensitive IRQ. The kernel, however, needs to be able to exit interrupt context in order to use I2C or SPI to access the device registers (which may involve sleeping). Therefore the interrupt needs to be masked out or paused in some way. The usual way to manage sleeping from within an interrupt handler is to use a threaded interrupt handler - a regular interrupt routine does the minimum amount of work needed to triage the interrupt before waking the interrupt service thread. If the threaded IRQ is marked as IRQF_ONESHOT the kernel will automatically mask out the interrupt until the thread runs to completion. The sc16is7xx driver used to use a threaded IRQ, but a patch switched to using a kthread_worker in order to set realtime priorities on the handler thread and for other optimisations. The end result is non-threaded IRQ that schedules some work then returns IRQ_HANDLED, making the kernel think that all IRQ processing has completed. The work-around to prevent a constant stream of interrupts is to mark the interrupt as edge-sensitive rather than level-sensitive, but interpreting an active-low source as a falling-edge source requires care to prevent a total cessation of interrupts. Whereas an edge-triggering source will generate a new edge for every interrupt condition a level-triggering source will keep the signal at the interrupting level until it no longer requires attention; in other words, the host won't see another edge until all interrupt conditions are cleared. It is therefore vital that the interrupt handler does not exit with an outstanding interrupt condition, otherwise the kernel will not receive another interrupt unless some other operation causes the interrupt state on the device to be cleared. The existing sc16is7xx driver has a very simple interrupt "thread" (kthread_work job) that processes interrupts on each channel in turn until there are no more. If both channels are active and the first channel starts interrupting while the handler for the second channel is running then it will not be detected and an IRQ stall ensues. This could be handled easily if there was a shared IRQ status register, or a convenient way to determine if the IRQ had been deasserted for any length of time, but both appear to be lacking. Avoid this problem (or at least make it much less likely to happen) by reducing the granularity of per-channel interrupt processing to one condition per iteration, only exiting the overall loop when both channels are no longer interrupting. Signed-off-by: Phil Elwell Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit ff37c427fb4f9cb64c6a765d35560c6a76ac6561 Author: Huacai Chen Date: Sat Sep 15 14:01:12 2018 +0800 MIPS/PCI: Call pcie_bus_configure_settings() to set MPS/MRRS [ Upstream commit 2794f688b2c336e0da85e9f91fed33febbd9f54a ] Call pcie_bus_configure_settings() on MIPS, like for other platforms. The function pcie_bus_configure_settings() makes sure the MPS (Max Payload Size) across the bus is uniform and provides the ability to tune the MRSS (Max Read Request Size) and MPS (Max Payload Size) to higher performance values. Some devices will not operate properly if these aren't set correctly because the firmware doesn't always do it. Signed-off-by: Huacai Chen Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20649/ Cc: Ralf Baechle Cc: James Hogan Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang Cc: Zhangjin Wu Cc: Huacai Chen Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 76c386b3a826112c0ac7c83a8c442c3a2c339379 Author: Joel Stanley Date: Fri Sep 14 13:36:47 2018 +0930 powerpc/boot: Ensure _zimage_start is a weak symbol [ Upstream commit ee9d21b3b3583712029a0db65a4b7c081d08d3b3 ] When building with clang crt0's _zimage_start is not marked weak, which breaks the build when linking the kernel image: $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$ 0000000000000058 g .text 0000000000000000 _zimage_start ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start': (.text+0x58): multiple definition of '_zimage_start'; arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here Clang requires the .weak directive to appear after the symbol is declared. The binutils manual says: This directive sets the weak attribute on the comma separated list of symbol names. If the symbols do not already exist, they will be created. So it appears this is different with clang. The only reference I could see for this was an OpenBSD mailing list post[1]. Changing it to be after the declaration fixes building with Clang, and still works with GCC. $ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$ 0000000000000058 w .text 0000000000000000 _zimage_start Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921 [1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCY Signed-off-by: Joel Stanley Reviewed-by: Nick Desaulniers Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 12b49fd34b8ca433c19dedf719faada112c692c4 Author: Dengcheng Zhu Date: Tue Sep 11 14:49:20 2018 -0700 MIPS: kexec: Mark CPU offline before disabling local IRQ [ Upstream commit dc57aaf95a516f70e2d527d8287a0332c481a226 ] After changing CPU online status, it will not be sent any IPIs such as in __flush_cache_all() on software coherency systems. Do this before disabling local IRQ. Signed-off-by: Dengcheng Zhu Signed-off-by: Paul Burton Patchwork: https://patchwork.linux-mips.org/patch/20571/ Cc: pburton@wavecomp.com Cc: ralf@linux-mips.org Cc: linux-mips@linux-mips.org Cc: rachel.mozes@intel.com Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 2a55f0985d484c0f993021923aa5337404d845cd Author: Nicholas Mc Guire Date: Sun Sep 9 12:02:32 2018 -0400 media: pci: cx23885: handle adding to list failure [ Upstream commit c5d59528e24ad22500347b199d52b9368e686a42 ] altera_hw_filt_init() which calls append_internal() assumes that the node was successfully linked in while in fact it can silently fail. So the call-site needs to set return to -ENOMEM on append_internal() returning NULL and exit through the err path. Fixes: 349bcf02e361 ("[media] Altera FPGA based CI driver module") Signed-off-by: Nicholas Mc Guire Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 4bdf5ec12d3c655e5dbead42dd194c094bb6d7b5 Author: Tomi Valkeinen Date: Wed Sep 26 12:11:27 2018 +0300 drm/omap: fix memory barrier bug in DMM driver [ Upstream commit 538f66ba204944470a653a4cccc5f8befdf97c22 ] A DMM timeout "timed out waiting for done" has been observed on DRA7 devices. The timeout happens rarely, and only when the system is under heavy load. Debugging showed that the timeout can be made to happen much more frequently by optimizing the DMM driver, so that there's almost no code between writing the last DMM descriptors to RAM, and writing to DMM register which starts the DMM transaction. The current theory is that a wmb() does not properly ensure that the data written to RAM is observable by all the components in the system. This DMM timeout has caused interesting (and rare) bugs as the error handling was not functioning properly (the error handling has been fixed in previous commits): * If a DMM timeout happened when a GEM buffer was being pinned for display on the screen, a timeout error would be shown, but the driver would continue programming DSS HW with broken buffer, leading to SYNCLOST floods and possible crashes. * If a DMM timeout happened when other user (say, video decoder) was pinning a GEM buffer, a timeout would be shown but if the user handled the error properly, no other issues followed. * If a DMM timeout happened when a GEM buffer was being released, the driver does not even notice the error, leading to crashes or hang later. This patch adds wmb() and readl() calls after the last bit is written to RAM, which should ensure that the execution proceeds only after the data is actually in RAM, and thus observable by DMM. The read-back should not be needed. Further study is required to understand if DMM is somehow special case and read-back is ok, or if DRA7's memory barriers do not work correctly. Signed-off-by: Tomi Valkeinen Signed-off-by: Peter Ujfalusi Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 0e4dde1a6e300ba40d9bdf5dc21bae7cf92a6564 Author: Daniel Axtens Date: Mon Oct 1 16:21:51 2018 +1000 powerpc/nohash: fix undefined behaviour when testing page size support [ Upstream commit f5e284803a7206d43e26f9ffcae5de9626d95e37 ] When enumerating page size definitions to check hardware support, we construct a constant which is (1U << (def->shift - 10)). However, the array of page size definitions is only initalised for various MMU_PAGE_* constants, so it contains a number of 0-initialised elements with def->shift == 0. This means we end up shifting by a very large number, which gives the following UBSan splat: ================================================================================ UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21 shift exponent 4294967286 is too large for 32-bit type 'unsigned int' CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6 Call Trace: [c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable) [c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64 [c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4 [c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0 [c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130 [c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80 ================================================================================ Fix this by first checking if the element exists (shift != 0) before constructing the constant. Signed-off-by: Daniel Axtens Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 427ccff8aabbbd3ace40edf1a22c0e72809a806a Author: Fabio Estevam Date: Mon Sep 10 14:45:23 2018 -0300 ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL [ Upstream commit 35d3cbe84544da74e39e1cec01374092467e3119 ] Andreas Müller reports: "Fixes: | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported | Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported and nasty follow-ups: Starting weston from sddm as unpriviledged user fails with some hints on missing access rights." Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues. Reported-by: Andreas Müller Signed-off-by: Fabio Estevam Acked-by: Otavio Salvador Signed-off-by: Shawn Guo Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9206016ce4079b492ad91b7898941fdbf1c8d7fe Author: Miles Chen Date: Mon Oct 8 10:39:17 2018 +0800 tty: check name length in tty_find_polling_driver() [ Upstream commit 33a1a7be198657c8ca26ad406c4d2a89b7162bcc ] The issue is found by a fuzzing test. If tty_find_polling_driver() recevies an incorrect input such as ',,' or '0b', the len becomes 0 and strncmp() always return 0. In this case, a null p->ops->poll_init() is called and it causes a kernel panic. Fix this by checking name length against zero in tty_find_polling_driver(). $echo ,, > /sys/module/kgdboc/parameters/kgdboc [ 20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457 uart_get_baud_rate+0xe8/0x190 [ 20.804917] Modules linked in: [ 20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8 [ 20.805469] Hardware name: linux,dummy-virt (DT) [ 20.805732] pstate: 20000005 (nzCv daif -PAN -UAO) [ 20.805895] pc : uart_get_baud_rate+0xe8/0x190 [ 20.806042] lr : uart_get_baud_rate+0xc0/0x190 [ 20.806476] sp : ffffffc06acff940 [ 20.806676] x29: ffffffc06acff940 x28: 0000000000002580 [ 20.806977] x27: 0000000000009600 x26: 0000000000009600 [ 20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0 [ 20.807576] x23: 0000000000000001 x22: 0000000000000000 [ 20.807807] x21: 0000000000000001 x20: 0000000000000000 [ 20.808049] x19: ffffffc06acffac8 x18: 0000000000000000 [ 20.808277] x17: 0000000000000000 x16: 0000000000000000 [ 20.808520] x15: ffffffffffffffff x14: ffffffff00000000 [ 20.808757] x13: ffffffffffffffff x12: 0000000000000001 [ 20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f [ 20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3 [ 20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f [ 20.809803] x5 : 0000000080008001 x4 : 0000000000000003 [ 20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000 [ 20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0 [ 20.811005] Call trace: [ 20.811214] uart_get_baud_rate+0xe8/0x190 [ 20.811479] serial8250_do_set_termios+0xe0/0x6f4 [ 20.811719] serial8250_set_termios+0x48/0x54 [ 20.811928] uart_set_options+0x138/0x1bc [ 20.812129] uart_poll_init+0x114/0x16c [ 20.812330] tty_find_polling_driver+0x158/0x200 [ 20.812545] configure_kgdboc+0xbc/0x1bc [ 20.812745] param_set_kgdboc_var+0xb8/0x150 [ 20.812960] param_attr_store+0xbc/0x150 [ 20.813160] module_attr_store+0x40/0x58 [ 20.813364] sysfs_kf_write+0x8c/0xa8 [ 20.813563] kernfs_fop_write+0x154/0x290 [ 20.813764] vfs_write+0xf0/0x278 [ 20.813951] __arm64_sys_write+0x84/0xf4 [ 20.814400] el0_svc_common+0xf4/0x1dc [ 20.814616] el0_svc_handler+0x98/0xbc [ 20.814804] el0_svc+0x8/0xc [ 20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 20.826913] Mem abort info: [ 20.827103] ESR = 0x84000006 [ 20.827352] Exception class = IABT (current EL), IL = 16 bits [ 20.827655] SET = 0, FnV = 0 [ 20.827855] EA = 0, S1PTW = 0 [ 20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____) [ 20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000 [ 20.829195] Internal error: Oops: 84000006 [#1] SMP [ 20.829564] Modules linked in: [ 20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G W 4.19.0-rc7ajb #8 [ 20.830545] Hardware name: linux,dummy-virt (DT) [ 20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO) [ 20.831174] pc : (null) [ 20.831457] lr : serial8250_do_set_termios+0x358/0x6f4 [ 20.831727] sp : ffffffc06acff9b0 [ 20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000 [ 20.832267] x27: ffffff900969e16f x26: 0000000000000000 [ 20.832589] x25: ffffff900969dfb0 x24: 0000000000000000 [ 20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160 [ 20.833232] x21: 0000000000000000 x20: ffffffc06acffac8 [ 20.833559] x19: ffffff900969df90 x18: 0000000000000000 [ 20.833878] x17: 0000000000000000 x16: 0000000000000000 [ 20.834491] x15: ffffffffffffffff x14: ffffffff00000000 [ 20.834821] x13: ffffffffffffffff x12: 0000000000000001 [ 20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f [ 20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3 [ 20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f [ 20.836111] x5 : c06419717c314100 x4 : 0000000000000007 [ 20.836419] x3 : 0000000000000000 x2 : 0000000000000000 [ 20.836732] x1 : 0000000000000001 x0 : ffffff900969df90 [ 20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____)) [ 20.837396] Call trace: [ 20.837566] (null) [ 20.837816] serial8250_set_termios+0x48/0x54 [ 20.838089] uart_set_options+0x138/0x1bc [ 20.838570] uart_poll_init+0x114/0x16c [ 20.838834] tty_find_polling_driver+0x158/0x200 [ 20.839119] configure_kgdboc+0xbc/0x1bc [ 20.839380] param_set_kgdboc_var+0xb8/0x150 [ 20.839658] param_attr_store+0xbc/0x150 [ 20.839920] module_attr_store+0x40/0x58 [ 20.840183] sysfs_kf_write+0x8c/0xa8 [ 20.840183] sysfs_kf_write+0x8c/0xa8 [ 20.840440] kernfs_fop_write+0x154/0x290 [ 20.840702] vfs_write+0xf0/0x278 [ 20.840942] __arm64_sys_write+0x84/0xf4 [ 20.841209] el0_svc_common+0xf4/0x1dc [ 20.841471] el0_svc_handler+0x98/0xbc [ 20.841713] el0_svc+0x8/0xc [ 20.842057] Code: bad PC value [ 20.842764] ---[ end trace a8835d7de79aaadf ]--- [ 20.843134] Kernel panic - not syncing: Fatal exception [ 20.843515] SMP: stopping secondary CPUs [ 20.844289] Kernel Offset: disabled [ 20.844634] CPU features: 0x0,21806002 [ 20.844857] Memory Limit: none [ 20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]--- Signed-off-by: Miles Chen Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 9bed31c93660e7af9caeebe8558d170e6f00e200 Author: Sam Bobroff Date: Wed Sep 12 11:23:20 2018 +1000 powerpc/eeh: Fix possible null deref in eeh_dump_dev_log() [ Upstream commit f9bc28aedfb5bbd572d2d365f3095c1becd7209b ] If an error occurs during an unplug operation, it's possible for eeh_dump_dev_log() to be called when edev->pdn is null, which currently leads to dereferencing a null pointer. Handle this by skipping the error log for those devices. Signed-off-by: Sam Bobroff Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman